Abstract

We consider the problem of risk-sensitive control of a stochastic network. In controlling such 
a network, an escape time criterion can be useful if one wishes to regulate the occurrence of large 
buffers and buffer overflow. In this paper a risk-sensitive escape time criterion is formulated, 
which in comparison to the ordinary escape time criteria penalizes exits which occur on short 
time intervals more heavily. The properties of the risk-sensitive problem are studied in the 
large buffer limit, and related to the value of a deterministic differential game with constrained 
dynamics. We prove that the game has value, and that the value is the (viscosity) solution of 
a PDE. For a simple network, the value is computed, demonstrating the applicability of the 
approach. 



1 Introduction 

In this paper we consider a problem of risk-sensitive control (or rare event control) for queueing 
networks. The network includes servers that can offer service to two or more classes of customers, 
and a choice must be made regarding which classes to offer service at each time. We study a 
stochastic control problem in which this choice is regarded as the control, and where the cost is a 
risk-sensitive version of the time to escape a bounded set. Hence, fixing $c > 0$, and denoting by $\sigma$ the
time when the queue-lengths process first exits a given domain, we consider $E_x e^{-c\sigma}$ as a criterion
to be minimized. Such a criterion penalizes short exit times more heavily than ordinary escape
time criteria (such as $E_x \sigma$, a criterion to be maximized). There are at least two motivations for the
use of such criteria when designing policies for the control of a network. The first is that in many
communication networks system performance is measured in terms of rare event probabilities (e.g., 
probabilities of data loss or excessive delay). The second motivation follows from the connection 
between risk-sensitive controls and robust controls. Indeed, as discussed in ^3], the optimization 
of a single fixed stochastic network with respect to a risk-sensitive cost criterion automatically
produces controls with specific and predictable robust properties. In particular, these controls give 
good performance for a family of perturbed network models (where the perturbation is around the 
design model and the size of the perturbation is measured by relative entropy), and with respect 
to a corresponding ordinary (i.e., not risk-sensitive) cost. 

In many problems, one considers the limit of the risk-sensitive problem as a scaling parameter 
of the system converges, in the hope that the limit model is more tractable. We follow the same 
approach here, and show that the normalized costs in the risk-sensitive problems converge to the 
value function of a differential game with constraints. As is well known, the convergence analysis 
is closely related to the large deviation properties of the sequence of controlled processes. An 
interesting feature in the setting of stochastic networks is that the asymptotic analysis of a sequence 
of controlled networks is in many ways simpler than the analogous asymptotic analysis of a sequence 
of uncontrolled networks. For example, if one were to fix a particular state feedback service policy 
at each station, then the calculation of the large deviation asymptotics is very difficult. In contrast, 
it turns out that calculation of the large deviation asymptotics of the optimally controlled network 
is quite feasible. This is largely due to the fact that a fixed service policy invariably includes 
some state discontinuities. For example, a priority policy switches drastically when the highest 
priority queue empties. When the policy is left as a parameter that is to be optimized these sharp 
discontinuities are not dealt with directly, since the control and the large deviation behavior are 
identified simultaneously. The situation is analogous to one found in the control of unconstrained 
processes such as diffusions. If a fixed nonsmooth feedback control is considered then large deviation 
asymptotics are generally intractable, but when the combined large deviation and optimal control 
problem is considered, much is possible [15].

For simplicity, we restrict attention in this paper to a class of Markovian networks, and consider just one
simple cost structure. Much more general statistical models can be treated with similar arguments, 
as can a more general cost. A more fundamental restriction is on the routing in the network. 
We assume a re-entrant line structure, so that the input streams follow a fixed route through the
network; we do not allow either randomized or controlled routing. Relaxing the last conditions
leads to a problem that is significantly more difficult to analyze, and would require a considerable 
extension of the results we prove. 

The deterministic game that is associated with the limit stochastic control problem involves 
two players. One player allocates service in a way analogous to the control in the stochastic control 
problem, and the other player perturbs the service and arrival rates. The cost is expressed in terms 
of the large deviation rate function for the underlying arrival and service processes, cumulated up 
to the time the dynamics exit the domain. Heuristically, the first player identifies those classes it 
is most worthwhile to allocate service to, so as to delay the escape as much as possible and thereby 
maximize the cost. The player who selects the perturbed rates attempts to minimize the cost by
driving the state out of the domain, while paying a cost for perturbing the rates.
Our main result states that as the scaling parameter of the system converges, the value for
the stochastic control problem converges to the value of the game. By way of proving the result, 
we also show that the Hamilton-Jacobi-Bellman equation associated with the game has a unique 
(Lipschitz continuous) viscosity solution. 

Several works have considered problems of optimal exit probabilities in the context of controlled
diffusion processes, in the asymptotically small white noise intensity regime. Fleming and
Souganidis JH] use viscosity solutions techniques to study a controlled diffusion where the control 
enters in the drift coefficient. Dupuis and Kushner extend their results to the case where the 
diffusion coefficient is possibly degenerate. Their technique relates the stochastic control problem 
to the game in a more direct way, using time discretization, without involving PDE analysis. The 
stochastic control problem studied in the current paper has the property that the jump rates in 
certain directions (those that correspond to services, not to arrivals) can be controlled to assume 
arbitrarily low values, including zero. It appears to be a more subtle problem than the ones in the 
above cited papers, in that it is analogous to a controlled diffusion problem where the control enters 
also in the diffusion coefficient, and where no uniform non-degeneracy condition is assumed. This 
kind of degeneracy makes it difficult to apply the time discretization idea of Dupuis and Kushner. The main idea
of [11], in which one directly relates the control problem to the game, is still fruitful in the current
setting. Following this approach, we relate the limit inferior [resp., superior] of the asymptotic value 
for the control problem to the upper [resp., lower] value of the game. However, showing that the 
game has value and thereby obtaining the full convergence result for the control problem requires 
a PDE analysis. 

The PDE analysis uses viscosity solutions methods. There are three types of boundary conditions
associated with the PDE: Neumann, Dirichlet, and "state space constraint." The first two
types of boundary conditions correspond in the game to the nonnegativity constraint on queue 
lengths and to stopping upon exit from the domain, respectively. The third type of boundary 
condition arises when there are portions of the boundary where exit can be blocked unilaterally by 
one of the players, and it is optimal for it to do so. It is well known since Soner [23] that such a 
scenario leads to the last boundary condition mentioned above. Combining techniques of and 
[25], we prove uniqueness of viscosity solutions for the PDE and show that the game's upper and
lower values are viscosity solutions, thus establishing existence of value. The trivial but crucial fact 
used in the uniqueness proof is that the Isaacs condition holds (equation (|H8))). 

As an example, we analyze a case where the domain is a hyper-rectangle, and where the network 
consists of one server and many queues, each customer requiring service only once. We find an 
explicit solution to the corresponding PDE, assuming the parameter c is large enough. This is only 
an initial result in this direction, but it shows that explicit solutions can be found. The solution 
turns out to be of particularly simple form (see equation (49)). The optimal service discipline
stemming from the solution corresponds to giving priority to class $i$ whenever the state of the
system is within a subset $G_i$ of the domain. The partitioning of the domain into subsets has a
simple structure too (see Figure 2 in Section 5 for an example in two dimensions). See [2] for
explicit solutions in the case of tandem queues, as well as identities relating the perturbed rates 
with the unperturbed ones in a more general network. 

There is relatively little work on risk-sensitive and robust control of networks. Ball et al. have
considered a robust formulation for network problems arising in vehicular traffic [1, 3], and have
explicitly identified the value function in certain instances. Although their model is similar to ours 
in that the network dynamics are modeled via a Skorokhod Problem, many other features, most 
notably the cost structure, are qualitatively different. In addition, the model they consider is not 
naturally related to a risk-sensitive control problem for a jump Markov model of a network. 

The organization of the paper is as follows. Section 2 introduces the network and the stochastic
control problem, describes a key tool in our analysis, namely the Skorokhod Problem (SP), introduces
the differential game, and states the main result. Section 3 establishes the relation between
the control problem asymptotics and the game's upper and lower values. Characterization of the
upper and lower values of the game as viscosity solutions of a PDE, as well as uniqueness for this
PDE, are established in Section 4. Section 5 presents an example, and the paper concludes with
Section 6, which gives the proofs of several lemmas. Throughout the paper, numbering such as
Lemma a.b refers to the b-th item of Lemma a.

2 Problem setting and the main result 

The queueing network control problem. We consider a system with $J$ customer classes, and
without loss assume that each class is identified with a queue at one of $K$ servers. Each server
provides service to at least one class. Thus if $C(k)$ denotes the set of classes that are served by
$k$, then the control determines who receives service effort at server $k$ from among $i \in C(k)$. In
particular, the sets $C(k)$, $k = 1, \ldots, K$ are disjoint, with $\bigcup_k C(k) = \{1, \ldots, J\}$. The state of the
network is the vector of queue lengths, denoted by $X$. After a customer of class $i$ is served, it turns
into a customer of class $r(i)$, where $i = 0$ is used to denote the "outside." We let $e_j$ denote the unit
vector in direction $j$ and set $e_0 = 0$, so that following service to class $j$ the state changes by $e_{r(j)} - e_j$.
The control will be described by the vector $u = (u_1, \ldots, u_J)$, where $u_i = 1$ if class $i$ customers are
given service and $u_i = 0$ otherwise. Since service can be given at any moment to only one class at
each station, the control vector must satisfy $\sum_{i \in C(k)} u_i \le 1$ for each $k$. We next consider the scaled
process $X^n$ under the scaling which accelerates time by a factor of $n$ and shrinks space by the
same factor. We are interested in a risk-sensitive cost functional that is associated with exit from
a bounded set. Let $G$ be a bounded subset of $\mathbb{R}_+^J$ that contains the origin (additional assumptions
on $G$ are given in Condition 1). Define



$$\sigma^n = \inf\{t : X^n(t) \notin G\}.$$

Then the control problem is to minimize the cost $E_x e^{-nc\sigma^n}$, where $E_x$ denotes expectation starting
from $x$, and $c > 0$ is a constant. With this cost structure "risk-sensitivity" means that atypically
short exit times are weighted heavily by the cost. A "good" control will avoid such an event with
high probability. The significance from the point of view of stabilization of the system is clear. (See
also ^Hl for the robust interpretation).

A precise description of the stochastic control problem is as follows. Let $G^n = n^{-1}\mathbb{Z}_+^J \cap G$.
Define

$$U = \Big\{u \in [0,1]^J : \sum_{i \in C(k)} u_i \le 1,\ k = 1, \ldots, K\Big\}.$$

For $u \in U$ and $f : \mathbb{Z}^J \to \mathbb{R}$ let

$$\mathcal{L}^u f(x) = \sum_{j=1}^J \lambda_j [f(x + e_j) - f(x)] + \sum_{j=1}^J u_j \mu_j 1_{\{x + v_j \in \mathbb{Z}_+^J\}} [f(x + v_j) - f(x)], \tag{1}$$

where $v_j = e_{r(j)} - e_j$. It is assumed that for each $i$, $\lambda_i \ge 0$, while $\mu_i > 0$. For each $n \in \mathbb{N}$ consider
the scaling defined by

$$\mathcal{L}^{n,u} f(x) = n \mathcal{L}^u g(nx), \tag{2}$$

where $f : n^{-1}\mathbb{Z}^J \to \mathbb{R}$ and $g(\cdot) = f(n^{-1}\cdot)$. A controlled Markov process starting from $x$ will consist
of a complete filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), P_x^{u,n})$, a state process $X^n$ taking values in $G^n$
that is continuous from the right and with limits from the left, a control process $u$ taking values in
$U$, such that $X^n$ is adapted to $\mathcal{F}_t$, $u$ is measurable and adapted to $\mathcal{F}_t$, $P_x^{u,n}(X^n(0) = x) = 1$, and
for every function $f : n^{-1}\mathbb{Z}^J \to \mathbb{R}$

$$f(X^n(t)) - \int_0^t \mathcal{L}^{n,u(s)} f(X^n(s))\,ds$$

is an $\mathcal{F}_t$-martingale. Here $E_x^{u,n}$ denotes expectation with respect to $P_x^{u,n}$. For a parameter $c > 0$, the
value function for the stochastic control problem is defined by

$$V^n(x) = -\inf n^{-1} \log E_x^{u,n} e^{-nc\sigma^n}, \qquad x \in G^n. \tag{3}$$

In this definition the infimum is over all controlled Markov processes. 

A measurable function $u(x,t)$, $u : G^n \times [0,\infty) \to U$, is said to be a feedback control. We will
make use of two well known facts: to each feedback control there corresponds a controlled Markov 
process with u(t) = u(X n (t),t), and in the definition of the value function the infimum can be 
restricted to feedback controls. 

In the formulation just given we allow the maximizing player to choose a control from the 
convex set $U$. This is a relaxed formulation, which allows the server to simultaneously split the
effort between two or more customer classes. An alternative control space that is more natural in
implementation consists of only the vertices of $U$, in which case the server can only serve one class
at a time. In a general game setting, the distinction between such "relaxed" and "pure" control 
spaces can be significant. However, in the present setting it will turn out that the value is the 
same for both cases. This is essentially due to the fact that the game arises from a risk-sensitive 
control problem, which imposes additional structure on the game, and will be further commented 
on below. 

Dynamics via the Skorokhod Problem. Our main goal will be to study the asymptotics of $V^n$,
and in particular, to show that they are governed by the value of a deterministic differential game. 
In order to define the dynamics of this game we first need a formulation of the Skorokhod Problem 
(SP). We give here the simplest formulation which covers our needs. The reader is referred to J2j 
for a more general framework. Let 

$$D_+([0,\infty) : \mathbb{R}^J) = \{\psi \in D([0,\infty) : \mathbb{R}^J) : \psi(0) \in \mathbb{R}_+^J\},$$

where $D([0,\infty) : \mathbb{R}^J)$ is the space of right continuous functions with left hand limits, endowed with
the uniform on compacts topology. When restricting to continuous functions we replace "D" with
"C". Let a set of vectors $\{\gamma_i, i = 1, \ldots, J\}$ be given and set $I(x) = \{i : x_i = 0\}$. For each point $x$
on $\partial\mathbb{R}_+^J$, the boundary of $\mathbb{R}_+^J$, let

$$d(x) = \Big\{ \sum_{i \in I(x)} a_i \gamma_i : a_i \ge 0,\ \Big\| \sum_{i \in I(x)} a_i \gamma_i \Big\| = 1 \Big\}.$$


The Skorokhod Map (SM) assigns to every path $\psi \in D_+([0,\infty) : \mathbb{R}^J)$ a path $\phi$ that starts at
$\phi(0) = \psi(0)$, but is constrained to $\mathbb{R}_+^J$ as follows. If $\phi$ is in the interior of $\mathbb{R}_+^J$ then the evolution
of $\phi$ mimics that of $\psi$, in that the increments of the two functions are the same until $\phi$ hits $\partial\mathbb{R}_+^J$.
When $\phi$ is on the boundary a constraining "force" is applied to keep $\phi$ in the domain, and this
force can only be applied in one of the directions $d(\phi(t))$, and only for $t$ such that $\phi(t)$ is on the
boundary. The precise definition is as follows. For $\eta \in D([0,\infty) : \mathbb{R}^J)$ and $t \in [0,\infty)$ we let $|\eta|(t)$
denote the total variation of $\eta$ on $[0,t]$ with respect to the Euclidean norm on $\mathbb{R}^J$.

Definition 1 Let $\psi \in D_+([0,\infty) : \mathbb{R}^J)$ be given. Then $(\phi, \eta)$ solves the SP for $\psi$ (with respect to
$\mathbb{R}_+^J$ and $\gamma_i$, $i = 1, \ldots, J$) if $\phi(0) = \psi(0)$, and if for all $t \in [0,\infty)$

1. $\phi(t) = \psi(t) + \eta(t)$,

2. $\phi(t) \in \mathbb{R}_+^J$,

3. $|\eta|(t) < \infty$,

4. $|\eta|(t) = \int_{[0,t]} 1_{\{\phi(s) \in \partial\mathbb{R}_+^J\}}\, d|\eta|(s)$,

5. There exists a Borel measurable function $\gamma : [0,\infty) \to \mathbb{R}^J$ such that $d|\eta|$-almost everywhere
$\gamma(t) \in d(\phi(t))$, and such that

$$\eta(t) = \int_{[0,t]} \gamma(s)\, d|\eta|(s).$$




Under a certain condition on $\{\gamma_i\}$ (known in the literature as the completely-S condition [22]), it is
known that solutions to the SP exist in all of $D_+([0,\infty) : \mathbb{R}^J)$. Under further conditions (namely,
existence of the set $B$; see [20, 12, 17] and also Lemma below), it is known that the Skorokhod
Map is Lipschitz continuous, and consequently the solution is unique. Denoting the map $\psi \mapsto \phi$ in
Definition 1 by $\Gamma$, the Lipschitz property states that there is a constant $K_1$ such that

$$\sup_{t \in [0,\infty)} \|\Gamma(\psi_1)(t) - \Gamma(\psi_2)(t)\| \le K_1 \sup_{t \in [0,\infty)} \|\psi_1(t) - \psi_2(t)\|, \qquad \psi_1, \psi_2 \in D_+([0,\infty) : \mathbb{R}^J). \tag{4}$$

The SP that will be considered here is the one for which $\gamma_i = e_i - e_{r(i)} = -v_i$. For this problem,
the following is well known.



Theorem 1 ([20, 17]) The SP associated with the domain $\mathbb{R}_+^J$ and the constraint vectors $\gamma_i$, $i =
1, \ldots, J$, possesses a unique solution, and the Skorokhod Map is Lipschitz continuous on the space
$D_+([0,\infty) : \mathbb{R}^J)$. Moreover, the Skorokhod Map takes $C_+([0,\infty) : \mathbb{R}^J)$ into $C_+([0,\infty) : \mathbb{R}^J)$, and
therefore $\Gamma(\psi)$ is continuous if $\psi$ is.
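
As a purely illustrative sketch (not taken from the paper), the one-dimensional case $J = 1$ with $r(1) = 0$, so that the single constraint direction is $e_1$, admits the classical explicit formula $\eta(t) = \sup_{s \le t} \max(0, -\psi(s))$. The function name `skorokhod_map_1d`, the time grid, and the two sample paths below are assumptions made for this example. The code applies the formula to discretely sampled paths and checks the Lipschitz bound in the uniform norm, for which the constant 2 suffices in one dimension.

```python
import numpy as np

def skorokhod_map_1d(psi):
    """Discrete-time Skorokhod map on [0, inf) for a sampled path psi with psi[0] >= 0.

    Returns (phi, eta) with phi = psi + eta, phi >= 0, and eta nondecreasing,
    increasing only at indices where phi is zero.
    """
    psi = np.asarray(psi, dtype=float)
    # eta(t) = max(0, max_{s <= t} (-psi(s))): the minimal cumulative push at the boundary.
    eta = np.maximum.accumulate(np.maximum(-psi, 0.0))
    phi = psi + eta
    return phi, eta

# Two sampled paths started at 0.5: one with constant drift -1, one perturbed (illustrative data).
t = np.linspace(0.0, 2.0, 201)
psi1 = 0.5 - t
psi2 = psi1 + 0.1 * np.sin(5.0 * t)

phi1, eta1 = skorokhod_map_1d(psi1)
phi2, eta2 = skorokhod_map_1d(psi2)

# Uniform-norm distances between the constrained and the unconstrained paths.
sup_phi = np.max(np.abs(phi1 - phi2))
sup_psi = np.max(np.abs(psi1 - psi2))
print(sup_phi <= 2.0 * sup_psi)
```

In higher dimensions with the directions $\gamma_i = e_i - e_{r(i)}$ no such one-line formula is available in general, which is why the existence, uniqueness and Lipschitz statements of Theorem 1 are needed.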
We next define a constrained ordinary differential equation. As has been proved in greater generality,
one can define a projection $\pi : \mathbb{R}^J \to \mathbb{R}_+^J$ that is consistent with the constraint directions
$\{\gamma_i, i = 1, \ldots, J\}$, in that $\pi(x) = x$ if $x \in \mathbb{R}_+^J$, and if $x \notin \mathbb{R}_+^J$ then $\pi(x) - x = \alpha r$, where $\alpha \ge 0$
and $r \in d(\pi(x))$. With this projection given, we can now define for each point $x \in \partial\mathbb{R}_+^J$ and each
$v \in \mathbb{R}^J$ the projected velocity

$$\pi(x, v) = \lim_{\Delta \downarrow 0} \frac{\pi(x + \Delta v) - \pi(x)}{\Delta}.$$

For details on why this limit is always well defined and further properties of the projected velocity
we refer to Section 3 and Lemma 3.8]. Let $v : [0,\infty) \to \mathbb{R}^J$ have the property that each
component of $v$ is integrable over each interval $[0,T]$, $T < \infty$. Then the ODE of interest takes the
form

$$\dot\phi(t) = \pi(\phi(t), v(t)), \qquad \phi(0) = x \in \mathbb{R}_+^J. \tag{5}$$

An absolutely continuous function $\phi : [0,\infty) \to \mathbb{R}_+^J$ is a solution to (5) if the equation is satisfied
a.e. in $t$. By using the regularity properties (4) of the associated Skorokhod Map and because of the
particularly simple nature of the right hand side, one can show that $\phi$ solves (5) if and only if $\phi$ is
the image of $\psi(t) = \int_0^t v(s)\,ds + x$ under the Skorokhod Map, and thus all the standard qualitative
properties (existence and uniqueness of solutions, stability with respect to perturbations, etc.) hold

PUDS]. 

As mentioned above, the SP formulation will be our means of defining the dynamics of a 
deterministic game. Before discussing this game, let us show how the same formulation is also useful 
for the stochastic control problem defined earlier in this section. First, since $v_j = e_{r(j)} - e_j = -\gamma_j$,
it is easy to verify that for the particular SP considered here $\pi(x, v) = v 1_{\{x + v \in \mathbb{Z}_+^J\}}$ for all $x \in \mathbb{Z}_+^J$
and $v \in \{v_j : j = 1, \ldots, J\}$. Therefore the generator $\mathcal{L}^u$ of (1) can also be written as

$$\mathcal{L}^u f(x) = \sum_{j=1}^J \lambda_j [f(x + e_j) - f(x)] + \sum_{j=1}^J u_j \mu_j [f(x + \pi(x, v_j)) - f(x)].$$
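
The equivalence of the two forms of the generator can be checked numerically on a small example. The sketch below is illustrative only: the helper names (`unit`, `projected_jump`, `generator_indicator`, `generator_projected`), the two-class tandem routing $r = (2, 0)$, the rates, and the test function $f$ are all assumptions made here, not objects from the paper. It evaluates the generator once with the indicator of $\{x + v_j \in \mathbb{Z}_+^J\}$ and once with the projected jump $\pi(x, v_j)$.

```python
import numpy as np

def unit(j, J):
    """Unit vector e_j in R^J for j = 1..J; e_0 is the zero vector (the "outside")."""
    e = np.zeros(J, dtype=int)
    if j >= 1:
        e[j - 1] = 1
    return e

def projected_jump(x, v):
    """pi(x, v) on the lattice: v if x + v stays in Z_+^J, the zero vector otherwise."""
    return v if np.all(x + v >= 0) else np.zeros_like(v)

def generator_indicator(f, x, u, lam, mu, r):
    """Generator with service jumps suppressed by an indicator when they leave Z_+^J."""
    x = np.asarray(x, dtype=int)
    J = len(x)
    total = 0.0
    for j in range(1, J + 1):
        total += lam[j - 1] * (f(x + unit(j, J)) - f(x))
        v = unit(r[j - 1], J) - unit(j, J)  # v_j = e_{r(j)} - e_j
        if np.all(x + v >= 0):
            total += u[j - 1] * mu[j - 1] * (f(x + v) - f(x))
    return total

def generator_projected(f, x, u, lam, mu, r):
    """Same generator written with the projected jump pi(x, v_j)."""
    x = np.asarray(x, dtype=int)
    J = len(x)
    total = 0.0
    for j in range(1, J + 1):
        total += lam[j - 1] * (f(x + unit(j, J)) - f(x))
        v = unit(r[j - 1], J) - unit(j, J)
        total += u[j - 1] * mu[j - 1] * (f(x + projected_jump(x, v)) - f(x))
    return total

# Hypothetical tandem network: class 1 feeds class 2, class 2 leaves the system.
r = [2, 0]
lam = [1.0, 0.0]
mu = [2.0, 3.0]
u = [1.0, 1.0]
f = lambda x: float(x[0] ** 2 + 3 * x[1])

a = generator_indicator(f, [0, 1], u, lam, mu, r)
b = generator_projected(f, [0, 1], u, lam, mu, r)
print(a, b)
```

At the state $(0, 1)$ the service jump of class 1 would leave the orthant, so it contributes nothing in either form; the two functions return the same number.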

A measurable function $u(t)$, $u : [0,\infty) \to U$, is said to be an open loop control. Note that this control
has no state feedback. When $u$ is an open loop control, it is possible to write the corresponding
controlled process $X$ as $\Gamma(Y)$. The process $Y$, which will be called the unconstrained controlled
process, is a controlled Markov process with a simpler structure. To be precise, let

$$\mathcal{L}_0^u f(x) = \sum_{j=1}^J \lambda_j [f(x + e_j) - f(x)] + \sum_{j=1}^J u_j \mu_j [f(x + v_j) - f(x)],$$

and let $\mathcal{L}_0^{n,u}$ be defined analogously to (2). A controlled Markov process $X^n$ on $G^n$ [respectively,
$Y^n$ on $n^{-1}\mathbb{Z}^J$] is defined as before, but now using the generator $(\mathcal{L}^{n,u} f)(t) = (\mathcal{L}^{n,u(t)} f)(x)$ [resp.,
$(\mathcal{L}_0^{n,u} f)(t) = (\mathcal{L}_0^{n,u(t)} f)(x)$]. The simplification that the SP introduces is that if $u$ is an open loop
control, and if $Y^n$ is a controlled Markov process corresponding to $\mathcal{L}_0^{n,u}$ on $(\Omega, \mathcal{F}, (\mathcal{F}_t), P_x^{u,n})$, then
$X^n = \Gamma(Y^n)$ is a controlled Markov process corresponding to $\mathcal{L}^{n,u}$ on the same filtered probability
space. The role played by the SP in relating constrained and unconstrained processes is exhibited 
here in a simple fashion, for introductory purposes. We will, in fact, use it in a slightly more 
complicated setting later on in Lemma and Lemma El 
A differential game. In this paper, we prove that the value function $V^n(x)$ for a stochastic control
problem associated with our queueing network model is approximately equal (for large $n$) to the
value function of a related differential game. In addition, the dynamics of this game are defined
in terms of an associated SP. Before introducing the game formally we explain why this is to be
expected. In a problem with no control, the exponential decay rate of quantities such as $E e^{-cn\sigma^n}$ is
given in terms of the sample path large deviation rate function associated with the process, which 
in turn can be expressed in terms of the rate function for the Poisson primitives that drive the 
model. This is supported by the well known Laplace's principle JO]- Heuristically, one thinks of 
the rate function as a cost paid for changing the measure so as to make the rare event of exiting
on a short time interval a probable event. Laplace's principle asserts that the decay rate can be
expressed as the solution to a deterministic optimization problem involving the cost $-c\sigma$ combined
with the cost of changing the measure: cf. [24, Eq. (5.20)-(5.23)]. When the stochastic model
involves optimal control, there is one more variable to optimize over in the limit, and this results
in a game. The game's deterministic dynamics are the natural law of large numbers limit under
the changed measure. Boundary constraints and constraining mechanisms which are present in
the prelimit model are represented in the limit model by the SP. The cost for the game involves
the large deviation rate function for the Poisson primitives, the time till the dynamics exits the
domain, and the parameter $c$.

We thus consider a zero sum game involving two players. One (which we call the maximizing 
player) selects the service allocation and attempts to maximize. The other (called the minimizing 
player) chooses the perturbed arrival and service rates and attempts to minimize. Throughout, the 
perturbed rates will be denoted by an overbar, as in $\bar\lambda_i, \bar\mu_i$.

Recall that for $u \in U$, $u_i$ stands for the fraction of service effort given to class $i$. The control
space for the maximizing player is

$$\bar U = \{u : [0,\infty) \to U ;\ u \text{ is measurable}\}.$$

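Let $l : \mathbb{R} \to \mathbb{R}_+ \cup \{+\infty\}$ be defined as

$$l(x) = \begin{cases} x \log x - x + 1, & x \ge 0, \\ +\infty, & x < 0, \end{cases}$$

where $0 \log 0 = 0$. Denoting $M = [0,\infty)^{2J}$, the control space for the minimizing player will be

$$\bar M = \{m = (\bar\lambda_1, \ldots, \bar\lambda_J, \bar\mu_1, \ldots, \bar\mu_J) : [0,\infty) \to M ;\ m \text{ is measurable},\ l \circ m \text{ is locally integrable}\}. \tag{6}$$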

For $u \in U$ and $m \in M$ define

$$v(u, m) = \sum_{j=1}^J \bar\lambda_j \tilde v_j + \sum_{i=1}^J u_i \bar\mu_i v_i,$$

where $\tilde v_j = e_j$, and as before $v_i = e_{r(i)} - e_i$. Then the dynamics are given by

$$\dot\phi(t) = \pi(\phi(t), v(u(t), m(t))), \qquad \phi(0) = x.$$
To define the cost for the game, let $\rho : U \times M \to \mathbb{R}_+ \cup \{+\infty\}$ be

$$\rho(u, m) = \sum_{i=1}^J \lambda_i\, l\Big(\frac{\bar\lambda_i}{\lambda_i}\Big) + \sum_{i=1}^J u_i \mu_i\, l\Big(\frac{\bar\mu_i}{\mu_i}\Big).$$

By convention, if $\lambda_i = 0$ and $\bar\lambda_i > 0$ for some $i$, we let $\rho = \infty$ (recall that by assumption, $\mu_i > 0$).
Let the exit time be defined by

$$\sigma = \inf\{t : \phi(t) \notin G\}.$$

With $c > 0$ as in (3), the cost is given by

$$C(x, u, m) = \int_0^\sigma [c + \rho(u(t), m(t))]\,dt.$$
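
The running cost can be evaluated directly from its definition. The following is a minimal sketch, assuming the running cost has the form $\sum_i \lambda_i\, l(\bar\lambda_i/\lambda_i) + \sum_i u_i \mu_i\, l(\bar\mu_i/\mu_i)$ with $l(x) = x \log x - x + 1$; the function names `ell` and `rho` and the argument order are choices made here for illustration. It shows that unperturbed rates cost nothing and that the convention for $\lambda_i = 0$ yields an infinite cost.

```python
import math

def ell(x):
    """l(x) = x log x - x + 1 for x >= 0 (with 0 log 0 = 0), and +inf for x < 0."""
    if x < 0:
        return math.inf
    if x == 0:
        return 1.0
    return x * math.log(x) - x + 1.0

def rho(u, lam, mu, lam_bar, mu_bar):
    """Running cost of perturbing rates (lam, mu) to (lam_bar, mu_bar) under allocation u."""
    total = 0.0
    for l_i, lb_i in zip(lam, lam_bar):
        if l_i == 0:
            # Convention from the text: infinite cost if lambda_i = 0 but its perturbation is positive.
            if lb_i > 0:
                return math.inf
            continue
        total += l_i * ell(lb_i / l_i)
    for u_i, m_i, mb_i in zip(u, mu, mu_bar):
        # mu_i > 0 by assumption, so the ratio is well defined.
        total += u_i * m_i * ell(mb_i / m_i)
    return total

# Hypothetical single-class example: doubling the arrival rate, service rate left unchanged.
cost_unperturbed = rho([1.0], [1.0], [2.0], [1.0], [2.0])
cost_doubled = rho([1.0], [1.0], [2.0], [2.0], [2.0])
print(cost_unperturbed, cost_doubled)
```

Since $l \ge 0$ with $l(1) = 0$, the minimizing player pays only for deviations from the nominal rates, and service perturbations for classes with $u_i = 0$ are free.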





As in [18], we need the notion of strategies. We endow both $\bar U$ and $\bar M$ with the metric $\rho(\omega_1, \omega_2) =
\sum_n 2^{-n} \big(\int_0^n |\omega_1(t) - \omega_2(t)|\,dt \wedge 1\big)$, and with the corresponding Borel $\sigma$-fields. A mapping $\alpha : \bar M \to \bar U$
is called a strategy for the maximizing player if it is measurable and if for every $m, \tilde m \in \bar M$ and
$t > 0$ such that

$$m(s) = \tilde m(s) \quad \text{for a.e. } s \in [0,t],$$

one has

$$\alpha[m](s) = \alpha[\tilde m](s) \quad \text{for a.e. } s \in [0,t].$$

In an analogous way, one defines a mapping $\beta : \bar U \to \bar M$ to be a strategy for the minimizing player.
The set of all strategies for the maximizing [resp., minimizing] player will be denoted by $A$ [resp.,
$B$]. The lower value for the game is defined as

$$V^-(x) = \inf_{\beta \in B} \sup_{u \in \bar U} C(x, u, \beta[u]),$$

and the upper value as

$$V^+(x) = \sup_{\alpha \in A} \inf_{m \in \bar M} C(x, \alpha[m], m).$$

To avoid confusion, we remark that despite the terms "upper" and "lower" value, it is not in general
obvious that $V^- \le V^+$.

Main result. We make the following assumption on the domain $G$. Let

$$J_+ = \{i \in \{1, \ldots, J\} : \lambda_i > 0\}.$$

Condition 1 We assume that the domain $G$ satisfies one of the following.

1. $G$ is a rectangle given by

$$G = \{(x_1, \ldots, x_J) : 0 \le x_i < z_i,\ i \in J_+;\ 0 \le x_j \le z_j,\ j \notin J_+\},$$

for some $z_i > 0$, $i = 1, \ldots, J$.

2. $G$ is simply connected and bounded, and given by

$$G = \bigcap_{i \in J_+} G_i,$$

where for $i \in J_+$ we are given positive Lipschitz functions $\phi_i : \mathbb{R}^{J-1} \to \mathbb{R}$, and

$$G_i = \{(x_1, \ldots, x_J) \in \mathbb{R}_+^J : 0 \le x_i < \phi_i(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_J)\}.$$

This condition covers many typical constraints one would consider on buffer size, including separate
constraints on individual queues (Condition 1.1) and one constraint on the sum of the queues
(Condition 1.2).

The shape of the domain is simpler in Condition 1.1 in that it is restricted to a hyper-rectangle.
On the other hand, it is also possible under this condition for the maximizing player to unilaterally
prevent an exit through a certain portion of $\partial G \setminus \partial\mathbb{R}_+^J$. Although it is in principle possible that
the dynamics could exit through this portion of the boundary, it will always be optimal for the
maximizing player to not allow it. Consider the simple network illustrated in Figure 1. The
maximizing player can prevent exit through the dashed portion of the boundary simply by stopping
service at the first queue. As a consequence, there are in general three different types of boundary:
the constraining boundary due to non-negativity constraints on queue length, the part of the
boundary where exit can be blocked, and the remainder. These three types of boundary behavior
will result, in the PDE analysis, in three types of boundary conditions. We now define the three
portions of the boundary. Under Condition 1.1, let

$$\partial_c G = \{(x_1, \ldots, x_J) \in G : x_j = z_j, \text{ some } j \notin J_+\}.$$

For notational convenience, we let $\partial_c G = \emptyset$ under Condition 1.2. In both cases we then set

$$\partial_o G = \partial G \setminus G, \qquad \partial_+ G = (G \cap \partial\mathbb{R}_+^J) \setminus \partial_c G.$$

Note that in both cases, $\partial_c G$, $\partial_o G$ and $\partial_+ G$ partition $\partial G$. Also, $\partial_c G \subset G$ while $\partial_o G \subset G^c$. As
usual, we will denote $G^\circ = G \setminus \partial G$ and $\bar G = G \cup \partial G$. The set $\partial_c G$ is the part of the boundary where the
maximizing player can prevent the dynamics from exiting, and $\partial_o G$ is the part where it cannot.
Finally, it will be convenient to denote

$$\partial_{co} G = \partial_o G \cup \partial_c G.$$
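
For a rectangular domain the three portions of the boundary can be told apart coordinate by coordinate. The sketch below is an illustration under stated assumptions: it treats only the rectangular case, takes the faces $x_i = z_i$ with $\lambda_i > 0$ as lying outside $G$ and the faces $x_j = z_j$ with $\lambda_j = 0$ as lying inside $G$, and uses a hypothetical function name `classify_point`, 0-based class indices, and string labels that do not appear in the paper.

```python
def classify_point(x, z, J_plus, tol=1e-12):
    """Classify a point of the closed rectangle [0, z] in the rectangular case.

    J_plus: set of 0-based indices i with lambda_i > 0 (0-based indexing is a choice made here).
    Returns 'outside', 'open' (exit boundary, not in G), 'blockable' (in G, exit can be
    blocked), 'reflecting' (non-negativity boundary), or 'interior'.
    """
    J = len(x)
    if any(x[i] < -tol or x[i] > z[i] + tol for i in range(J)):
        return "outside"
    at_upper = [abs(x[i] - z[i]) <= tol for i in range(J)]
    at_zero = [abs(x[i]) <= tol for i in range(J)]
    # Faces x_i = z_i with i in J_+ are excluded from G: boundary points not in G.
    if any(at_upper[i] for i in range(J) if i in J_plus):
        return "open"
    # Faces x_j = z_j with j not in J_+ belong to G: exit there can be blocked.
    if any(at_upper[j] for j in range(J) if j not in J_plus):
        return "blockable"
    # Remaining boundary points of G lie on the boundary of the orthant.
    if any(at_zero):
        return "reflecting"
    return "interior"

# Hypothetical two-queue tandem: only class 1 has external arrivals, buffer sizes z = (1, 1).
z = (1.0, 1.0)
J_plus = {0}
labels = [classify_point(p, z, J_plus)
          for p in [(1.0, 0.5), (0.5, 1.0), (0.0, 0.5), (0.5, 0.5)]]
print(labels)
```

The order of the tests mirrors the set differences in the definitions: a corner such as $(1, 1)$ is not in $G$ and is therefore classified as part of the exit boundary, while $(0, 1)$ is in $G$ and falls in the blockable portion.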

Our main result is the following. 

Theorem 2 Let Condition 1 hold. Then $V^+ = V^- =: V$ on $G$. Moreover, if $x_n \in G^n$, $n \in \mathbb{N}$, are
such that $x_n \to x \in G$, then $\lim_{n \to \infty} V^n(x_n) = V(x)$.

Remark: A stronger form of the convergence statement in fact holds. Namely,

$$\limsup_{\epsilon \downarrow 0} \limsup_{n \to \infty} \sup\{|V^n(x) - V(y)| : x \in G^n,\ y \in G,\ |x - y| \le \epsilon\} = 0.$$

This is an immediate consequence of Theorem 2 and the fact that for each $n$, $V^n$ is Lipschitz on
$G^n$, with a constant that does not depend on $n$ (Lemma 2).

Figure 1: A simple queueing network, a rectangular domain and three types of boundary. Full
line: $\partial_+ G$, dashed line: $\partial_c G$, and dotted line: $\partial_o G$.

The proof is established in two major steps. Step 1 will be an immediate consequence of the
main results of Section 3, and Step 2 will follow from Section 4.

Step 1. We define a version of the game, technically easier to work with, in which all perturbed rates
$(\bar\lambda_i, \bar\mu_i)$ are bounded by $b < \infty$. The corresponding upper and lower values, defined analogously,
are denoted by $V^{b,+}$ and $V^{b,-}$. Then we show that for all $b$ large enough (cf. Theorem [2J)

$$V^{b,+}(x) \le \liminf_{n \to \infty} V^n(x_n) \le \limsup_{n \to \infty} V^n(x_n) \le V^{b,-}(x).$$

Step 2. We show that for $b$ large, $V^{b,+} = V^{b,-}$ on $G$. To this end, we formulate a PDE for which
we show that uniqueness of (Lipschitz) viscosity solutions holds (Theorem EJ), and also show that
both $V^{b,+}$ and $V^{b,-}$ are viscosity solutions (Theorem |HJ). Since $V^n(x)$ does not depend on $b$, neither
do $V^{b,\pm}(x)$. Theorem 2 follows.



3 The control problem and the game 



We begin by stating some basic properties of the stochastic control problem and of the deterministic 
game. The proofs of these properties are deferred to Section 6.

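Consider the following generators, defined for any $u \in U$ and $m \in M$, for constrained and
unconstrained controlled Markov processes:

$$\mathcal{L}^{n,u,m} f(x) = \sum_{j=1}^J n \bar\lambda_j \Big[f\Big(x + \tfrac{1}{n}\tilde v_j\Big) - f(x)\Big] + \sum_{i=1}^J n u_i \bar\mu_i \Big[f\Big(x + \tfrac{1}{n}\pi(x, v_i)\Big) - f(x)\Big],$$

$$\mathcal{L}_0^{n,u,m} f(x) = \sum_{j=1}^J n \bar\lambda_j \Big[f\Big(x + \tfrac{1}{n}\tilde v_j\Big) - f(x)\Big] + \sum_{i=1}^J n u_i \bar\mu_i \Big[f\Big(x + \tfrac{1}{n}v_i\Big) - f(x)\Big].$$
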



The definition of the corresponding controlled processes will be made precise in Lemmas and |H1 
Owing to the logarithmic transform in (3), one expects $V^n$ to satisfy an Isaacs equation [19].
In fact, $V^n$ satisfies

$$\begin{cases} 0 = \sup_{u \in U} \inf_{m \in M} \big[\mathcal{L}^{n,u,m} V^n(x) + c + \rho(u, m)\big], & x \in G^n, \\ V^n(x) = 0, & x \notin G^n. \end{cases} \tag{7}$$

We comment that this is also the dynamic programming equation (DPE) for an associated stochastic 
game that is related to the deterministic game via a law of large numbers scaling and limit, and 
will not be further considered in this paper. 

Lemma 1 The value function $V^n$ of (3) uniquely solves the DPE (7).

The following lemma gives a key estimate on the value function.

Lemma 2 Under Condition 1, $V^n(x)$ satisfies the Lipschitz property on $(n^{-1}\mathbb{Z}_+^J) \cap G$ with a
constant that does not depend on $n \in \mathbb{N}$. Consequently, $\sup_{n,\, x \in G^n} V^n(x) < \infty$.

We comment that the above result is, in general, not valid for $V^n$ on $n^{-1}\mathbb{Z}_+^J$, since $V^n$ changes
abruptly near the portion $\partial_c G$ of the boundary.

For each fixed $u \in U$, the mapping $m \mapsto \rho(u, m)$, when restricted to $\bar\mu_i$ such that $u_i > 0$,
is strictly convex with compact level sets. We conclude that the infimum over $m$ in the DPE is
achieved, and denote such a point by $m^n(x, u)$. Although part 1 of the following lemma is not used
elsewhere, it indicates why the Isaacs condition should hold in (7).

Lemma 3 Let Condition 1 hold. Then

1. $m^n(x, u)$ can be chosen independently of $u$, and

2. there is $b_0 < \infty$ such that for all $x$, $n$ and $u$, $m^n(x, u) \le b_0$.
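
The following computation is a heuristic sketch, not the proof given in the paper, of why the minimizer in the DPE can be taken independent of $u$. It assumes that the bracket in the DPE has the form written in the first line below, with running cost $\sum_j \lambda_j\, l(\bar\lambda_j/\lambda_j) + \sum_i u_i \mu_i\, l(\bar\mu_i/\mu_i)$; the shorthand $D_j$, $D'_i$ and the starred rates are introduced here for illustration only.

```latex
% Heuristic sketch under the assumptions stated in the lead-in.
% Shorthand (not from the paper):
%   D_j  = V^n(x + \tilde v_j / n) - V^n(x),   D'_i = V^n(x + \pi(x, v_i)/n) - V^n(x).
\begin{align*}
\mathcal{L}^{n,u,m} V^n(x) + \rho(u,m)
 &= \sum_{j=1}^J \Big[ n \bar\lambda_j D_j + \lambda_j\, l\big(\bar\lambda_j/\lambda_j\big) \Big]
  + \sum_{i=1}^J u_i \Big[ n \bar\mu_i D'_i + \mu_i\, l\big(\bar\mu_i/\mu_i\big) \Big]. \\
% Each bracket is convex in its own rate, and l'(y) = \log y, so the first order conditions read
n D_j + \log\big(\bar\lambda_j/\lambda_j\big) &= 0, \qquad n D'_i + \log\big(\bar\mu_i/\mu_i\big) = 0, \\
\bar\lambda_j^{*} &= \lambda_j e^{-n D_j}, \qquad \bar\mu_i^{*} = \mu_i e^{-n D'_i}, \\
% Substituting back, using l(e^{-a}) = 1 - e^{-a} - a e^{-a}:
\inf_{m \in M} \big[ \mathcal{L}^{n,u,m} V^n(x) + \rho(u,m) \big]
 &= \sum_{j=1}^J \lambda_j \big(1 - e^{-n D_j}\big) + \sum_{i=1}^J u_i \mu_i \big(1 - e^{-n D'_i}\big).
\end{align*}
```

Terms with $\lambda_j = 0$ are understood through the convention that the cost is infinite unless $\bar\lambda_j = 0$, so they contribute nothing. The starred rates do not involve $u$ (when $u_i = 0$ the $i$-th service term vanishes and any $\bar\mu_i$ is optimal), so a single $m^*$ minimizes the bracket for every $u$. Writing $F(u, m)$ for the bracket, this gives $\inf_m \sup_u F(u, m) \le \sup_u F(u, m^*) = \sup_u \inf_m F(u, m)$, and since the reverse inequality always holds, the supremum and infimum can be interchanged.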

We introduce two parametric variations of the game defined in Section 2. The first will be associated with domain perturbation (parameterized by the symbol $a$), and the second with a bound on the perturbed rates (parameterized by the symbol $b$).

For some fixed $a_0 > 0$, consider perturbations $G_a$, $a \in (-a_0, a_0)$, of the domain $G$ defined as follows. If $G$ satisfies Condition 1.1, then $G_a$ is defined as $G$, but with $z_i$ replaced by $z_i + a$, $i = 1, \ldots, J$. If $G$ is as in Condition 1.2, then $G_a$ is defined as $G$, but where $\phi_i$ is replaced by $\phi_i + a$, $i \in J_+$.

For any $b \in (0, \infty)$, let $M^b = [0, b]^{2J}$. Analogously to the definition of $\bar M$, let

$$\bar M^b = \{ m = (\bar\lambda_1, \ldots, \bar\lambda_J, \bar\mu_1, \ldots, \bar\mu_J) : [0, \infty) \to M^b ;\ m \text{ is measurable} \}. \tag{8}$$

Strategies and values for the game are then defined analogously to the way strategies and values are defined for the original game, using $\bar M^b$ in place of $\bar M$. It will be convenient to set $M^\infty = M$ and $\bar M^\infty = \bar M$, and to refer to the original game of Section 2 as the case $b = \infty$.

The cost, sets of strategies, lower and upper values of the games resulting from the introduction of the parameters $a$ and $b$ will be denoted as $C_a(x,u,m)$, $A^b$, $B^b$, $V_a^{b,-}$ and $V_a^{b,+}$. When $a = 0$ [resp., $b = \infty$], the dependence on $a$ [$b$] will be eliminated from the notation, as in $V_a^-$, $V^{b,-}$.

Let $b_0$ be as in Lemma 3. Denote

$$b^* = \max\{ b_0, \lambda_i, \mu_i, i = 1, \ldots, J \} + 1. \tag{9}$$



Lemma 4 Assume Condition 1. Then

1. $\mathrm{dist}(\partial_{co} G_a, \partial_{co} G) = \inf\{ |x - y| : x \in \partial_{co} G_a,\ y \in \partial_{co} G \} > 0$ if $0 < |a| < a_0$;

2. the values are bounded on $G$, uniformly for $b \in [b^*, \infty]$, and there is a constant $c_0$ such that for any $x \in G$, $|a| < \varepsilon$ (where $\varepsilon$ depends on $x$), and $b \in [b^*, \infty]$, one has $|V_a^{b,-}(x) - V^{b,-}(x)| \le c_0 |a|$ and $|V_a^{b,+}(x) - V^{b,+}(x)| \le c_0 |a|$.

The following lemma shows that any nearly optimal strategy for the minimizing player will 
satisfy a uniform upper bound on the integrated running cost. Moreover, there is a finite time To 
such that for each such minimizing strategy, any open loop control used by the maximizing player 
leads to exit by To. Similarly, given any strategy for the maximizing player the minimizing player 
can restrict to open loop controls that force exit by To. 

Lemma 5 Fix $b \in [b^*, \infty]$. Given $\beta \in B^b$, write $(\bar\lambda_i(\cdot), \bar\mu_i(\cdot)) = \beta[u](\cdot)$. For $z, T > 0$ let $B^{z,T}$ denote the set of $\beta \in B^b$ which satisfy

$$\int_0^T \sum_i \left[ \lambda_i\, l(\bar\lambda_i(t)/\lambda_i) + u_i(t)\, \mu_i\, l(\bar\mu_i(t)/\mu_i) \right] dt \le z,$$

for all $u \in U$. For $\alpha \in A^b$, let $M(\alpha, T)$ denote the set of $m \in M$ for which $\sigma(x, \alpha[m], m) \le T$. Then there are constants $z_0, T_0 > 0$ such that

$$V^-(x) = \inf_{\beta \in B^{z_0, T_0}} \sup_{u \in U} \int_0^{\sigma \wedge T_0} \left[ c + \rho(u(t), \beta[u](t)) \right] dt,$$

and

$$V^+(x) = \sup_{\alpha \in A} \inf_{m \in M(\alpha, T_0)} \int_0^{\sigma \wedge T_0} \left[ c + \rho(\alpha[m](t), m(t)) \right] dt.$$

In the rest of the section the strategies $\beta$ will be assumed (without loss) to be in $B^{z_0, T_0}$, where $z_0$, $T_0$ are as in Lemma 5 and are fixed throughout. Also, $m \in M$ will be assumed to be in $M(\alpha, T_0)$ whenever it is clear which $\alpha$ is considered. With an abuse of notation, we denote $B^{z_0, T_0}$ by $B$.

Lemma 6 Under Condition 1, $V^{b,-}$ and $V^{b,+}$ are Lipschitz on $G$, uniformly for $b \in [b^*, \infty]$.

We are now ready to prove the following result. 

Theorem 3 Let Condition 1 hold, and let $b \in [b^*, \infty)$. Then for any $x \in G$ and any $\{x_n\}$ converging to $x$ (with $x_n \in G^n$),

$$V^{b,+}(x) \le \liminf_{n \to \infty} V^n(x_n) \le \limsup_{n \to \infty} V^n(x_n) \le V^{b,-}(x).$$

Proof of Theorem 3: The result is established by considering a sequence of stochastic processes, defined using the constrained ODEs, but for which the controls $u$ and $m$ are governed by, on one hand, a nearly optimal strategy for the game, and on the other hand, a nearly optimal control for the stochastic control problem. The technique uses standard martingale estimates, and is based on the construction (deferred to Section 6) of an auxiliary controlled Markov process that is controlled by the selected strategy and stochastic control.

Upper bound 

Fix $b \in [b^*, \infty)$. The dependence on $b$ will be suppressed in the notation for $V^-$, $V_a^-$, etc. We first show that

$$\limsup_{n \to \infty} V^n(x_n) \le V^-(x). \tag{10}$$

According to Lemma 4.2, it is enough to show that for all $a > 0$

$$\limsup_{n \to \infty} V^n(x_n) \le V_a^-(x).$$

Let $\beta \in B^b$ and $a > 0$ be given, and set $C_a(x, \beta) = \sup_{u \in U} C_a(x, u, \beta[u])$. It is enough to show that

$$\limsup_{n \to \infty} V^n(x_n) \le C_a(x, \beta), \quad a > 0. \tag{11}$$

We therefore fix $\beta$ throughout, and turn to prove (11). We can assume without loss that

$$C_a(x, \beta) < \infty. \tag{12}$$

Note that in the DPE (7) the supremum is with respect to $u$ in a compact set $U$, and that the function being maximized is continuous in $u$ (for each $y$). Let $u^n(y)$ denote a point where it is achieved. Then for any $m \in M$ and $y \in G^n$,

$$0 \le \mathcal{L}^{n, u^n(y), m} V^n(y) + c + \rho(u^n(y), m). \tag{13}$$

Lemma 7 Let $n$ be fixed, and let $b \in [b^*, \infty)$, $\beta$ and $x_n$ be as above. Then there is a filtered probability space $(\Omega, F, (F_t), P)$, and $F_t$-adapted RCLL processes $X^n$, $Y^n$ and $\bar m^n$ such that with $P$-probability one $\bar m^n(t) = \beta[\bar u^n](t)$ a.e. $t$, $\bar u^n(t) = u^n(X^n(t))$, $X^n = \Gamma(Y^n)$, $X^n(0) = Y^n(0) = x_n$, and for any $f$

$$f(X^n(t)) - \int_0^t \mathcal{L}^{n, \bar u^n(s), \bar m^n(s)} f(X^n(s))\, ds, \qquad f(Y^n(t)) - \int_0^t \tilde{\mathcal{L}}^{n, \bar u^n(s), \bar m^n(s)} f(Y^n(s))\, ds$$

are $(F_t)$-martingales. Moreover, with $T_0$ as in Lemma 5, let $N^n$ denote the total number of jumps of $Y^n$ on $[0, T_0]$. Then

$$E N^n \le 2 J T_0 b n. \tag{14}$$

Proof: See Section 6.

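Returning to the proof of (11), let $\bar u^n(t) = u^n(X^n(t))$ and let $\sigma^n$ be the first exit time of $X^n$ from $G$. Combining (13) and Lemma 7, for any bounded stopping time $S \le \sigma^n$,

$$V^n(x_n) \le E_{x_n} \left[ V^n(X^n(S)) + \int_0^S \left[ c + \rho(\bar u^n(s), \beta[\bar u^n](s)) \right] ds \right]. \tag{15}$$

Denoting $\beta[\bar u^n](t) = (\bar\lambda_i^n(t), \bar\mu_i^n(t))$, define $\phi^n$ as $\phi^n = \Gamma(\psi^n)$, where

$$\psi^n(t) = x + \int_0^t v(\bar u^n, \beta[\bar u^n])\, ds,$$

and let

$$\sigma_a^n = \inf\{ t : \phi^n(t) \notin G_a \}.$$

Then the definition of $C_a(x, \beta)$ implies

$$\int_0^{\sigma_a^n} \left[ c + \rho(\bar u^n(s), \beta[\bar u^n](s)) \right] ds \le C_a(x, \beta).$$

Apply (15) with $S = \sigma_a^n \wedge \sigma^n \wedge T$. If $T$ is sufficiently large, then (12) and the fact that $c > 0$ imply $\sigma_a^n \le T$. Thus, using $E_{x_n} V^n(X^n(\sigma^n)) = 0$,

$$V^n(x_n) \le E_{x_n} \left[ V^n(X^n(\sigma_a^n)) 1_{\{\sigma_a^n < \sigma^n\}} + \int_0^{\sigma_a^n \wedge \sigma^n} \left[ c + \rho(\bar u^n(s), \beta[\bar u^n](s)) \right] ds \right].$$

Again using the uniform boundedness of $V^n(x)$ (Lemma 2), there is a constant $b_2 < \infty$ such that for all $n$

$$V^n(x_n) \le b_2 P_{x_n}(\sigma_a^n < \sigma^n) + C_a(x, \beta). \tag{16}$$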

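In what follows, we show that $P_{x_n}(\sigma_a^n < \sigma^n)$ tends to zero. To this end, note that $\tilde{\mathcal{L}}^{n,u,m}\, \mathrm{id}(y) = \sum_i \bar\lambda_i v_i + \sum_i u_i \bar\mu_i \tilde v_i$, where $\mathrm{id}$ is the identity map. Therefore, using again Lemma 7,

$$Y^n(t) - x_n = \int_0^t \left[ \sum_{i=1}^J \bar\lambda_i^n(s) v_i + \sum_{i=1}^J \bar u_i^n(s) \bar\mu_i^n(s) \tilde v_i \right] ds + \eta^n(t),$$

where $\eta^n$ is a zero mean martingale. To prove that

$$\sup_{t \in [0,T]} |\eta^n(t)| \to 0 \text{ in distribution},$$

it is enough, by Doob's maximal inequality, to show that

$$E |\eta^n(T)|^2 \to 0. \tag{17}$$
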
Let $[x](t) = \sum_{s \in [0,t]} |\Delta x_s|^2$. By the Burkholder-Davis-Gundy inequality (see [?], VII.92),

$$E |\eta^n(T)|^2 \le c_1 E [\eta^n](T),$$

where $c_1$ is a constant. Since each jump is bounded by $c_2 n^{-1}$ ($c_2$ a constant) and the total number of jumps $N^n(T)$ satisfies (14),

$$E |\eta^n(T)|^2 \le c_3 n^{-2} E N^n(T) \le c_4 n^{-1},$$

which proves (17). This implies that $\sup_{[0,T]} |Y^n(t) - \psi^n(t)| \to 0$ in distribution, and therefore the continuity of $\Gamma$ implies $\sup_{[0,T]} |X^n(t) - \phi^n(t)| \to 0$ in distribution. By Lemma 4.1,

$$P_{x_n}(\sigma_a^n < \sigma^n) \le P_{x_n}\left( \sup_{[0,T]} |X^n(t) - \phi^n(t)| > b_3 \right),$$

where $b_3 > 0$ depends only on $a$. Hence by (17), $P_{x_n}(\sigma_a^n < \sigma^n) \to 0$ as $n \to \infty$. Therefore (16) implies

$$\limsup_{n \to \infty} V^n(x_n) \le C_a(x, \beta).$$

This gives (11) and completes the proof of (10).

Lower bound

Next we prove

$$\liminf_{n \to \infty} V^n(x_n) \ge V^+(x). \tag{18}$$

By Lemma 4.2, it is enough to show that for all $a < 0$

$$\liminf_{n \to \infty} V^n(x_n) \ge V_a^+(x).$$

Let $\alpha \in A^b$ be given, and set $C_a(x, \alpha) = \inf_{m \in M^b} C_a(x, \alpha[m], m)$. Then it suffices to show

$$\liminf_{n \to \infty} V^n(x_n) \ge C_a(x, \alpha), \quad a < 0. \tag{19}$$

Fixing $\alpha$, we now prove (19).

Interchanging the order of infimum and supremum in equation (7) (see [23, Corollary 37.3.2]), and noting that the infimum over $m$ is of a continuous function with compact level sets, we denote by $m^n(y)$ a point where the infimum is achieved. By Lemma 3 the components $\bar\lambda_i^n(y)$ and $\bar\mu_i^n(y)$ of $m^n(y)$ are all bounded by $b_0$. For $u \in U$ and $y \in G^n$,

$$0 \ge \mathcal{L}^{n, u, m^n(y)} V^n(y) + c + \rho(u, m^n(y)). \tag{20}$$



Lemma 8 Let $n$ be fixed, and let $\alpha$ and $x_n$ be as above. Then there is a filtered probability space $(\Omega, F, (F_t), P)$, and $F_t$-adapted RCLL processes $X^n$, $Y^n$ and $\bar u^n$ such that with $P$-probability one $\bar u^n(t) = \alpha[\bar m^n](t)$ a.e. $t$, $\bar m^n(t) = m^n(X^n(t))$, $X^n = \Gamma(Y^n)$, $X^n(0) = Y^n(0) = x_n$, and for any $f$,

$$f(X^n(t)) - \int_0^t \mathcal{L}^{n, \bar u^n(s), \bar m^n(s)} f(X^n(s))\, ds, \qquad f(Y^n(t)) - \int_0^t \tilde{\mathcal{L}}^{n, \bar u^n(s), \bar m^n(s)} f(Y^n(s))\, ds$$

are $(F_t)$-martingales.

Proof: See Section 6.



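Let $\bar m^n(t) = m^n(X^n(t))$ and let $\sigma^n$ be the first exit time of $X^n$ from $G$. By (20) and Lemma 8, for any bounded stopping time $S \le \sigma^n$,

$$V^n(x_n) \ge E_{x_n} \left[ V^n(X^n(S)) + \int_0^S \left[ c + \rho(\alpha[\bar m^n](s), \bar m^n(s)) \right] ds \right]. \tag{21}$$

Denoting $\bar m^n(t) = (\bar\lambda_i^n(t), \bar\mu_i^n(t))$, define $\phi^n$ as $\phi^n = \Gamma(\psi^n)$, where

$$\psi^n = x + \int_0^{\cdot} v(\bar u^n, \bar m^n)\, ds,$$

and let

$$\sigma_a^n = \inf\{ t : \phi^n(t) \notin G_a \}.$$

Then the definition of $C_a(x, \alpha)$ implies

$$\int_0^{\sigma_a^n} \left[ c + \rho(\alpha[\bar m^n](s), \bar m^n(s)) \right] ds \ge C_a(x, \alpha).$$

Apply (21) with $S = \sigma_a^n \wedge \sigma^n \wedge T$, with large enough $T$, using the fact that $V^n \ge 0$ to get

$$V^n(x_n) \ge E_{x_n} \int_0^{\sigma_a^n \wedge \sigma^n \wedge T} \left[ c + \rho(\alpha[\bar m^n](s), \bar m^n(s)) \right] ds$$

$$\ge E_{x_n} \left[ 1_{\{\sigma_a^n < \sigma^n\}} \int_0^{\sigma_a^n \wedge T} \left[ c + \rho(\alpha[\bar m^n](s), \bar m^n(s)) \right] ds \right]$$

$$\ge P_{x_n}(\sigma_a^n < \sigma^n)\, C_a(x, \alpha).$$

The proof that $P_{x_n}(\sigma_a^n < \sigma^n)$ tends to one is analogous to the proof that $P_{x_n}(\sigma_a^n < \sigma^n) \to 0$ in the upper bound. It is therefore omitted. Hence

$$\liminf_{n \to \infty} V^n(x_n) \ge C_a(x, \alpha).$$

This gives (19), and the proof of (18) is established. ■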

In fact, the value of the game is independent of $b$ for large $b$, so that the game has a value with the unbounded action space $M$. As the result depends on Theorems 3, 5 and 6, we postpone the proof to a later section.

Theorem 4 For all $b \in [b^*, \infty]$, $V^{b,+} = V^+ = V^{b,-} = V^-$.

Proof: Deferred to a later section.

4 The PDE



In this section we show that the upper and lower values of the game are the unique Lipschitz viscosity solutions of the PDE (23). Throughout, the parameter $b \in [b^*, \infty)$ is fixed. Let

$$H(q) = \inf_m \sup_u \left[ \langle q, v(u,m) \rangle + \rho(u,m) + c \right]. \tag{22}$$

It will be useful to note that the infimum is over the compact set $M^b$, and the map $(q, u, m) \mapsto [\langle q, v(u,m) \rangle + \rho(u,m) + c]$ is continuous. The PDE of interest is

$$\begin{aligned} H(DV(x)) &= 0, && x \in G^o, \\ \langle DV(x), \gamma_i \rangle &= 0, && i \in I(x),\ x \in \partial_+ G, \\ V(x) &= 0, && x \in \partial_o G. \end{aligned} \tag{23}$$

Here, $\gamma_i$ are the directions of constraint that were introduced in Section 2.

Definition 2 Let a Lipschitz continuous function $u : X \to \mathbb{R}$ be given (where $X \subset G$). We say that $u$ is a subsolution [respectively, supersolution] to (23) on $X$ if the following conditions hold. Let $\theta : X \to \mathbb{R}$ be continuously differentiable on $X$. Let $y \in X$ be a local maximum [minimum] of the map $x \mapsto u(x) - \theta(x)$. Then

$$H(D\theta(y)) \vee \max_{i \in I(y)} \langle D\theta(y), \gamma_i \rangle \ge 0, \tag{24}$$

$$[\ H(D\theta(y)) \wedge \min_{i \in I(y)} \langle D\theta(y), \gamma_i \rangle \le 0,\ ] \tag{25}$$

and

$$u(x) \le 0, \quad x \in X \cap \partial_o G, \tag{26}$$

$$[\ u(x) \ge 0, \quad x \in X \cap \partial_o G.\ ] \tag{27}$$

We say that $V$ is a viscosity solution to (23) if it is both a subsolution on $G$ and a supersolution on $G \setminus \partial_c G$.

Remark: In case that $\partial_c G \ne \emptyset$, a viscosity solution is often called a constrained viscosity solution (cf. Soner [?], Capuzzo-Dolcetta and Lions [8]). The requirement that $V$ is a subsolution up to the boundary $\partial_c G$ (the part of the boundary where exit can be unilaterally blocked) serves as a boundary condition on this part of the boundary. Note that in the current paper, the term 'constrained' refers to the part $\partial_+ G$ of the boundary, where it is the mechanism associated with the Skorokhod Problem that constrains the dynamics to $G$.

First, we address uniqueness of solutions to (23).

Theorem 5 Let $u$ be a subsolution and $v$ a supersolution to (23). Then $u \le v$ on $G$.

The proof combines ideas from two sources, namely [?] (which is based on [?], and discusses how to deal with the constrained dynamics on $\partial_+ G$), and [?] (to accommodate the fact that under Condition 1.1 part of the boundary ($\partial_c G$) can be thought of as imposing a state-space constraint on the maximizing player).

The following lemma will be used in proving Theorem 5. In the interest of consistency with previous publications, we use $B$ in Lemma 9 below to denote a certain subset of $\mathbb{R}^J$ (although everywhere except in this section, $B$ denotes a set of strategies). Part 1 states that the "Set B" Condition holds, namely a condition under which it was proved in [?] that the SM enjoys the regularity property (1). The proof that this condition holds in the current setting can be found in [17]. The existence of a smooth version of the set $B$ is proved in [?] (before Lemma 2.1). For Parts 2 and 3, see Lemmas 2.1 and 2.2 of [?] (note that the condition that $\gamma_i$ are independent holds).



Lemma 9 1. There exists a compact, convex, and symmetric set $B \subset \mathbb{R}^J$ with $0 \in B^o$, such that if $z \in \partial B$ and if $n$ is an outward normal to $B$ at $z$, then for all $i \in \{1, \ldots, J\}$

$$\langle z, e_i \rangle > -1 \text{ implies } \langle \gamma_i, n \rangle \ge 0 \quad \text{and} \quad \langle z, e_i \rangle < 1 \text{ implies } \langle \gamma_i, n \rangle \le 0.$$

In addition, the unit outward normal $n(x)$ to $B$ at $x \in \partial B$ is unique and continuous (as a function on $\partial B$).

2. Let $n$ be the extension of $n$ to $\mathbb{R}^J$ satisfying $n(x) = n(y)$ whenever $ax = y \in \partial B$, some $a \in (0, \infty)$ (and define $n(0)$ arbitrarily). Let $\Xi : \mathbb{R}^J \to \mathbb{R}_+$ be defined via

$$\Xi(x) = a \iff x \in \partial(aB)$$

for all $a \in [0, \infty)$, and let $\xi(x) = (\Xi(x))^2$. Then there exist constants $m, M \in (0, \infty)$ and a function $g : \mathbb{R}^J \to [m, M]$ such that the $C^1(\mathbb{R}^J)$ function $\xi$ satisfies $m\|x\|^2 \le \xi(x) \le M\|x\|^2$, and $D\xi(x) = g(x)\Xi(x)n(x)$.

3. There exists a constant $m_1 \in (0, \infty)$ and a continuously differentiable function $\mu : \mathbb{R}_+^J \to [0, m_1]$ such that $\|D\mu\| \le m_1$ on $\mathbb{R}_+^J$ and

$$\langle D\mu(x), \gamma_i \rangle < 0, \quad x \in \mathbb{R}_+^J,\ i \in I(x).$$



In what follows, we keep the notation of Lemma 9 for $B$, $n$, $\Xi$, $\xi$, $g$ and $\mu$.

Proof of Theorem 5: For $a > 0$, let

$$U(x) = u(x) - a\mu(x), \qquad V(x) = v(x) + a\mu(x).$$

Let $\delta > 0$. Then it suffices to show that for all small $a > 0$, $\delta > 0$, one has $U \le (1 + \delta)V$ on $G$. Arguing by contradiction, we assume that this is not true. Then there are $a$ and $\delta$ arbitrarily small such that

$$\rho = \sup_{x \in G} [U(x) - (1 + \delta)V(x)] > 0.$$

Below we let $c_i$, $i = 1, 2, \ldots$ denote positive constants. Consider Condition 1.1 first. Let

$$\Phi(x, y) = U(x) - (1 + \delta)V(y) - \frac{1}{\varepsilon}\xi(x - y - \varepsilon^{1/2} y). \tag{28}$$
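Let $(\bar x, \bar y) \in G^2$ achieve the maximum of $\Phi$ in $G \times G$. By continuity of $U$ and $V$, there exists $z \in G$ so that $\rho = U(z) - (1 + \delta)V(z)$. Note that

$$(1 + \varepsilon^{1/2})^{-1} z \in G. \tag{29}$$

Hence by the Lipschitz continuity of $V$,

$$\Phi(\bar x, \bar y) \ge \Phi\left(z, \frac{z}{1 + \varepsilon^{1/2}}\right) = U(z) - (1 + \delta)V\left(\frac{z}{1 + \varepsilon^{1/2}}\right) \ge \rho - c_1 \varepsilon^{1/2}. \tag{30}$$

By Lipschitz continuity of $U$ and the lower bound on $\xi$ given in Lemma 9,

$$\begin{aligned} \Phi(\bar x, \bar y) &= U(\bar x) - (1 + \delta)V(\bar y) - \frac{1}{\varepsilon}\xi(\bar x - \bar y - \varepsilon^{1/2}\bar y) \\ &\le U(\bar y) + c_2|\bar x - \bar y| - (1 + \delta)V(\bar y) - \frac{c_3}{\varepsilon}|\bar x - \bar y - \varepsilon^{1/2}\bar y|^2 \\ &\le \rho + c_2|\bar x - \bar y| - \frac{c_3}{\varepsilon}|\bar x - \bar y - \varepsilon^{1/2}\bar y|^2. \end{aligned} \tag{31}$$

By (30) and (31),

$$c_2|\bar x - \bar y| + c_1\varepsilon^{1/2} \ge \frac{c_3}{\varepsilon}|\bar x - \bar y - \varepsilon^{1/2}\bar y|^2 \ge \frac{c_3}{2\varepsilon}|\bar x - \bar y|^2 - c_4|\bar y|^2. \tag{32}$$

Since $\bar x$ and $\bar y$ are bounded, (32) implies $|\bar x - \bar y|^2 \le c_5\varepsilon$ and so

$$|\bar x - \bar y| \le c_6\varepsilon^{1/2}. \tag{33}$$

Using this in (32) we have

$$|\bar x - \bar y - \varepsilon^{1/2}\bar y| \le c_7\varepsilon^{3/4}. \tag{34}$$

By (33), $\bar x - \bar y \to 0$ as $\varepsilon \downarrow 0$. Also, we claim that for all $\varepsilon > 0$ small, $\bar x$ and $\bar y$ are bounded away from $\partial_o G$. To see this, assume the contrary. Then along a subsequence, both $\bar x$ and $\bar y$ must converge to the same point on $\partial_o G$. Using the continuity of $u$ and $v$, (26)-(27), and the non-negativity of $\xi$, $\limsup \Phi(\bar x, \bar y) \le \limsup [u(\bar x) - (1 + \delta)v(\bar y)] \le 0$, where the limit superior is taken along this subsequence. However, by (30), for all small $\varepsilon$, $\Phi(\bar x, \bar y) \ge \rho/2 > 0$, which gives a contradiction.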

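Let

$$\theta(x) = \frac{1}{\varepsilon}\xi(x - \bar y - \varepsilon^{1/2}\bar y) + a\mu(x),$$

and note that the map $x \mapsto u(x) - \theta(x)$ has a maximum at $\bar x \in G$. Since $u$ is a subsolution, (24) must be satisfied at $\bar x$. Denoting

$$q^\varepsilon = g(\bar x - \bar y - \varepsilon^{1/2}\bar y)\, \Xi(\bar x - \bar y - \varepsilon^{1/2}\bar y)\, n(\bar x - \bar y - \varepsilon^{1/2}\bar y), \tag{35}$$

we have from Lemma 9.2 that

$$D\theta(\bar x) = \frac{1}{\varepsilon}q^\varepsilon + aD\mu(\bar x).$$

Suppose $i$ is such that $\bar x_i = 0$. Then $\langle \bar x - \bar y - \varepsilon^{1/2}\bar y, e_i \rangle \le 0$ and so by Lemma 9.1,

$$\langle \gamma_i, n(\bar x - \bar y - \varepsilon^{1/2}\bar y) \rangle \le 0.$$

Since by Lemma 9.3, $\langle \gamma_i, D\mu(\bar x) \rangle < 0$, it follows that $\langle \gamma_i, D\theta(\bar x) \rangle < 0$. It follows from (24) that

$$H(D\theta(\bar x)) \ge 0,$$

namely,

$$H\left(\frac{1}{\varepsilon}q^\varepsilon + aD\mu(\bar x)\right) \ge 0. \tag{36}$$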

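On the other hand, let

$$\alpha(y) = -a\mu(y) - \frac{1}{\varepsilon(1 + \delta)}\xi(\bar x - y - \varepsilon^{1/2} y).$$

Note that

$$D\alpha(y) = -aD\mu(y) + \frac{1 + \varepsilon^{1/2}}{\varepsilon(1 + \delta)}q^\varepsilon$$

at $y = \bar y$, and that the map $y \mapsto v(y) - \alpha(y)$ has a minimum at $\bar y$. Since $\bar x \in G$, (34) implies that $\bar y \in G \setminus \partial_c G$ for all small $\varepsilon$. Since $v$ is a supersolution, (25) is satisfied at $\bar y$. An argument as above shows that

$$H(D\alpha(\bar y)) \le 0,$$

and therefore

$$H\left(\frac{1}{1 + \delta}\left[\frac{1 + \varepsilon^{1/2}}{\varepsilon}q^\varepsilon - a(1 + \delta)D\mu(\bar y)\right]\right) \le 0.$$

It follows from the definition of $H$, using $\rho(u, m) \ge 0$, that

$$H(q) + c\delta \le (1 + \delta)\, H\left(\frac{q}{1 + \delta}\right) \quad \text{for all } q,$$

and therefore

$$H\left(\frac{1}{\varepsilon}q^\varepsilon + \frac{1}{\varepsilon^{1/2}}q^\varepsilon - a(1 + \delta)D\mu(\bar y)\right) + c\delta \le 0. \tag{37}$$

Now $D\mu$ is bounded, and by (34), boundedness of $n$ and $g$ and the Lipschitz continuity of $\Xi$, it follows that $\varepsilon^{-1/2}q^\varepsilon$ converges to zero as $\varepsilon \to 0$. Note that by (22) and the following comment, $H$ is uniformly continuous on $\mathbb{R}^J$. Therefore, (36) and (37) give a contradiction when $a > 0$ and $\varepsilon > 0$ are small and $\delta > 0$ fixed.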

Under Condition 1.2 the above argument is not valid, since (29) may not hold. However, in this case the minimizing player can force exit from any point on $\partial G \setminus \partial_+ G$, and the additional complications due to the "state-space constraint" used under part 1 are no longer needed. In other words, instead of (28) we can consider

$$\Phi(x, y) = U(x) - (1 + \delta)V(y) - \frac{1}{\varepsilon}\xi(x - y),$$

and a review of the above proof shows that (36) and (37) still hold if the expression $\varepsilon^{1/2}$ is replaced by zero everywhere in (35) and (37). A contradiction is then obtained analogously. ■

We next consider the upper and lower values of the game, and remind the reader that in this section the rates $m$ are assumed bounded.

Theorem 6 $V^-$ and $V^+$ are solutions to (23).

Recall that from Lemma 6, $V^\pm$ are Lipschitz.

Proof of Theorem 6: We use the specific form of $v(u, m)$ and $\rho(u, m)$. These can be written as

$$v(u, m) = b_0(m) + \sum_{i=1}^J u_i b_i(m),$$

$$\rho(u, m) = c_0(m) + \sum_{i=1}^J u_i c_i(m).$$

We have $\sum_{i \in C(k)} u_i \le 1$. The $b_i$ are linear, and the $c_i$ are convex. Hence, as a direct consequence of [23, Corollary 37.3.2], the Isaacs condition holds, namely

$$H(q) = \inf_m \sup_u \left[ \langle q, v(u, m) \rangle + c + \rho(u, m) \right] = \sup_u \inf_m \left[ \langle q, v(u, m) \rangle + c + \rho(u, m) \right]. \tag{38}$$

Another fact that we will use is that for any $y \in G \setminus \partial_c G$ there is $\delta_0 = \delta_0(y) > 0$ which serves as a lower bound on the exit time. Namely, if $\phi$ solves $\dot\phi = \pi(\phi, v(u, m))$, $\phi(0) = y$, then

$$\sigma = \inf\{ t \ge 0 : \phi(t) \notin G \} \ge \delta_0, \quad u \in U,\ m \in M. \tag{39}$$

The bound is an immediate consequence of the $u$ and $m$ being uniformly bounded.

By definition, $V^\pm(x) = 0$ for $x \in \partial_o G$. Thus we only need to establish (24)-(25). The proof consists of four parts.
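Proof that $V^-$ is a supersolution on $G \setminus \partial_c G$.

Standard dynamic programming arguments show that for $\delta > 0$,

$$V^-(x) = \inf_\beta \sup_u \left[ \int_0^{\sigma \wedge \delta} (c + \rho(u, \beta[u]))\, dt + V^-(\phi(\sigma \wedge \delta)) \right], \tag{40}$$

where $\phi$ is the solution to $\dot\phi = \pi(\phi, v(u, \beta[u]))$, with $\phi(0) = x$. Let $\theta$ be smooth, and let $y \in G \setminus \partial_c G$ be a local minimum of $V^- - \theta$. We can assume without loss that $V^-(y) = \theta(y)$. We need to show

$$H(D\theta(y)) \wedge \min_{i \in I(y)} \langle D\theta(y), \gamma_i \rangle \le 0. \tag{41}$$

We shall assume the contrary and reach a contradiction. Thus, there exists $a > 0$ such that $H(D\theta(y)) \ge a$, and

$$\langle D\theta(y), \gamma_i \rangle \ge a, \quad i \in I(y). \tag{42}$$

From the definition of $H$ and (38),

$$\sup_u \inf_m \left[ \langle D\theta(y), v(u, m) \rangle + c + \rho(u, m) \right] \ge a,$$

and therefore there exists a $u_0$ such that for all $m$,

$$\langle D\theta(y), v(u_0, m) \rangle + c + \rho(u_0, m) \ge a/2.$$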
For any strategy $\beta$, if $\bar u(t) \equiv u_0$,

$$\langle D\theta(y), v(\bar u(t), \beta[\bar u](t)) \rangle + c + \rho(\bar u(t), \beta[\bar u](t)) \ge a/2 \tag{43}$$

for all $t$. Let $\phi$ denote the dynamics corresponding to $\bar u$ and a generic $\beta$, starting from $y$. Note that the mapping $z \mapsto I(z)$ is upper semi-continuous, in the sense that for any $z$ there is a neighborhood of $z$ on which $I(\cdot) \subset I(z)$. Using the boundedness of $m$ this implies that for any $\beta \in B$, one has $I(\phi(r)) \subset I(y)$ for $r \in [0, \delta]$, if $\delta > 0$ is chosen small enough. We now use that $\phi$ is a solution to the SP. Choosing such a $\delta > 0$, for any $r \in [0, \delta]$ there exist $a_i \ge 0$ (that may depend on $r$) such that

$$\dot\phi(r) = v(\bar u(r), \beta[\bar u](r)) + \sum_{i \in I(y)} a_i \gamma_i.$$

Using the continuity of $D\theta$ and taking $\delta > 0$ smaller if necessary, (42) and (43) imply, for $t \in [0, \delta]$,

$$\frac{d}{dt}\theta(\phi(t)) \ge -c - \rho(\bar u(t), \beta[\bar u](t)) + a/4.$$

Taking $\delta$ even smaller if necessary (so that it is at most $\delta_0$), we have from (39) that

$$\theta(\phi(\delta)) - \theta(y) \ge -\int_0^\delta (c + \rho(\bar u(t), \beta[\bar u](t)))\, dt + a\delta/4.$$

From (40), one can find a $\beta$ such that

$$V^-(y) \ge \sup_u \left[ \int_0^\delta (c + \rho(u, \beta[u]))\, dt + V^-(\phi(\delta)) \right] - a\delta/8.$$

Letting $u = \bar u$, the last two displays give (using $\theta(y) = V^-(y)$)

$$\theta(\phi(\delta)) \ge V^-(\phi(\delta)) + a\delta/8,$$

so that $V^-(\phi(\delta)) - \theta(\phi(\delta)) < 0$ for all $\delta > 0$ small, contradicting the assumption that $y$ is a local minimum of $V^- - \theta$. This proves that $V^-$ is a supersolution on $G \setminus \partial_c G$.

Proof that $V^-$ is a subsolution on $G$.

Let $\theta$ be smooth and $y \in G$ a local maximum of $V^- - \theta$. In case that $y \in \partial_c G$, let $U_{y,\beta,\delta}$ be the set of controls $u \in U$ for which the trajectory determined by $u$ and $\beta[u]$ and starting from $y$ does not exit $G$ on $[0, \delta]$. Given $y \in \partial_c G$, it is clear that $U_{y,\beta,\delta}$ is not empty for all $\delta$ small and all $\beta$, by considering the control $u = 0$. Moreover, for all $\delta$ small enough, (40) is valid where the supremum extends only over $u \in U_{y,\beta,\delta}$. Indeed, given $u \notin U_{y,\beta,\delta}$, consider $u'$ that agrees with $u$ on $[0, \sigma]$ and $u' = 0$ on $(\sigma, \delta]$. Then the expression in brackets in (40) is identical under $u$ and under $u'$, but $u' \in U_{y,\beta,\delta}$.

Assume without loss that $V^-(y) = \theta(y)$. We would like to show that

$$H(D\theta(y)) \vee \max_{i \in I(y)} \langle D\theta(y), \gamma_i \rangle \ge 0. \tag{44}$$
Assuming the contrary, there exists $a > 0$ such that $H(D\theta(y)) \le -a$, and

$$\langle D\theta(y), \gamma_i \rangle \le -a, \quad i \in I(y). \tag{45}$$

Using the definition of $H$ and (38), for all $u$ there exists $m_u$ such that

$$\langle D\theta(y), v(u, m_u) \rangle + c + \rho(u, m_u) \le -a/2. \tag{46}$$

Note that it is possible to choose $m_u$ so that it depends continuously on $u$. Define $\bar\beta$ as $\bar\beta[u](t) = m_{u(t)}$ for all $t$. Since $\bar\beta[u]$ is measurable if $u$ is, $\bar\beta$ maps $U$ into $M$. Let $\phi$ be the trajectory corresponding to $\bar\beta$ and a generic $u \in U$ (or a generic $u \in U_{y,\bar\beta,\delta}$ if $y \in \partial_c G$) starting from $y$. Arguing as before by upper semi-continuity of $I(\cdot)$, if $\delta$ is small enough, then

$$\dot\phi(r) = v(u(r), \bar\beta[u](r)) + \sum_{i \in I(y)} a_i \gamma_i, \quad r \in [0, \delta],$$

where $a_i \ge 0$ may depend on $r$. By possibly taking $\delta$ smaller, and smaller than $\delta_0$, we have, using the continuity of $D\theta$ and (45), (46) that

$$\frac{d}{dt}\theta(\phi(t)) \le -c - \rho(u(t), \bar\beta[u](t)) - a/4,$$

and

$$\theta(\phi(\delta)) - \theta(y) \le -\int_0^\delta (c + \rho(u(t), \bar\beta[u](t)))\, dt - a\delta/4.$$

Now, (40) implies that for any $\beta$ there is $u$ such that

$$V^-(y) \le \int_0^\delta (c + \rho(u, \beta[u]))\, dt + V^-(\phi(\delta)) + a\delta/8.$$

Specializing to $\beta = \bar\beta$, the last two displays show that $V^-(\phi(\delta)) - \theta(\phi(\delta)) > 0$ for all $\delta > 0$ small. This contradicts the assumption that $y$ is a local maximum of $V^- - \theta$, and as a result, $V^-$ is a subsolution.

Proof that $V^+$ is a supersolution on $G \setminus \partial_c G$.

The proof is analogous to the proof that $V^-$ is a subsolution. Most details are therefore skipped. The dynamic programming principle states that for $\delta > 0$,

$$V^+(x) = \sup_\alpha \inf_m \left[ \int_0^{\sigma \wedge \delta} (c + \rho(\alpha[m], m))\, dt + V^+(\phi(\sigma \wedge \delta)) \right], \tag{47}$$

where $\phi$ is the dynamics corresponding to $\alpha$ and $m$, starting from $x$. Taking a smooth $\theta$, and letting $y \in G \setminus \partial_c G$ be a local minimum of $V^+ - \theta$, the inequality

$$H(D\theta(y)) \wedge \min_{i \in I(y)} \langle D\theta(y), \gamma_i \rangle \le 0$$

can be obtained by an argument analogous to that used to prove (44), using (47) in place of (40).

Proof that $V^+$ is a subsolution on $G$.

We need to show that

$$H(D\theta(y)) \vee \max_{i \in I(y)} \langle D\theta(y), \gamma_i \rangle \ge 0, \tag{48}$$

where $\theta$ is smooth, and $y \in G$ is a local maximum of $V^+ - \theta$. In the special case where $y \in \partial_c G$, we can assume without loss that the supremum in (47) extends only over $\alpha \in A_{y,\delta}$, the set of strategies under which, for any $m \in M^b$, the dynamics associated with $\alpha$ and $m$, and starting from $y$, does not leave $G$ before $\delta$. The proof of (48) is analogous to the proof of (41), and is skipped.

This completes the proof that $V^-$ and $V^+$ are solutions to (23). ■

5 A competing queues example 

Consider a queueing network with only one server, providing service to $J$ classes. Each customer requires service once. In this example all arrival rates are positive: $\lambda_i > 0$ for all $i$, hence $J_+ = \{1, \ldots, J\}$. This network, "the $k$ competing queues," has been studied extensively, in discrete and continuous time (see [3, 20] and references therein). When the criterion (to be minimized) is either the average cost or the discounted cost, and the one-step cost is a positive linear combination $\sum_i c_i x_i$ of the queue sizes $x_i$, the optimal policy is the $\mu$-$c$ rule, which is a priority discipline, giving absolute priority to the non-empty queue for which $\mu_i c_i$ is maximal. Under the cost studied here, the optimal policy is quite different.

Proposition 1 Consider the case where $G$ is a hyper-rectangle, given as $G = \{x : 0 \le x_i \le z_i\}$, where $z_i > 0$ are constants. Assume that $\lambda_i > 0$ for all $i = 1, \ldots, J$. If $c$ is large enough, then the viscosity solution to the PDE (23) is given as

$$V(x) = \min_i \alpha_i (z_i - x_i), \tag{49}$$

where $\alpha_i > 0$ are constants depending on $c$.

We remark that the constants $\alpha_i$ are uniquely defined by (51) below. In the totally symmetric case, where $\mu_i = \mu$, $\lambda_i = \lambda$, $z_i = z$ for all $i$, the solution takes the form $V(x) = \alpha \min_i (z - x_i)$. In this case, the optimal service discipline can be interpreted as "serve the longest queue." An asymmetric two dimensional example is given in Figure 2, where the domain $G$ is divided into two subdomains $G_1$ and $G_2$ in accordance with the structure (49), and the optimal service discipline corresponds to giving priority to class $i$ when the state is within $G_i$, $i = 1, 2$. Thus the optimal control under our escape-time criterion is very different from the optimal controls for the average or discounted cost criteria.

Proof: The constraint directions are given by $\gamma_i = e_i$. The Hamiltonian is given by



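$$H(p) = \sup_u \inf_m H(p, u, m),$$

where

$$H(p, u, m) = c + \sum_i \left[ \bar\lambda_i p_i + \lambda_i\, l(\bar\lambda_i/\lambda_i) \right] + \sum_i u_i \left[ -\bar\mu_i p_i + \mu_i\, l(\bar\mu_i/\mu_i) \right].$$

Figure 2: Priority to class $i$ when the state is in $G_i$, $i = 1, 2$.

Using strict convexity and smoothness of the map $m \mapsto H(p, u, m)$, the minimum over $m$ is attained at $\bar\lambda_i = \lambda_i e^{-p_i}$, $\bar\mu_i = \mu_i e^{p_i}$. Thus

$$H(p, u) = \inf_m H(p, u, m) = c + \sum_i \left[ \lambda_i (1 - e^{-p_i}) + u_i \mu_i (1 - e^{p_i}) \right].$$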

m. ' * 



For the proposed solution, $DV(x) \in -\mathbb{R}_+^J$ wherever the gradient is defined. For $p \in -\mathbb{R}_+^J$, maximizing $H(p, u)$ over $u$ clearly gives

$$H(p) = \sup_u H(p, u) = c + \sum_i \lambda_i (1 - e^{-p_i}) + \max_i \mu_i (1 - e^{p_i}). \tag{50}$$
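As a rough numerical illustration of (50), the following Python sketch evaluates the closed form and compares it with a brute-force maximum of $H(p,u)$ over the extreme points of the control set. It assumes the single-server control set $\{u \ge 0, \sum_i u_i \le 1\}$; the function names and the parameter values in the usage note are hypothetical and are not taken from the paper.

```python
import math

def hamiltonian_given_u(p, u, lam, mu, c):
    """H(p, u): the Hamiltonian after the minimization over m."""
    return c + sum(
        l * (1.0 - math.exp(-pi)) + ui * m * (1.0 - math.exp(pi))
        for l, m, pi, ui in zip(lam, mu, p, u)
    )

def hamiltonian(p, lam, mu, c):
    """Closed form (50), stated for p in the nonpositive orthant."""
    if any(pi > 0 for pi in p):
        raise ValueError("formula (50) is stated for p with all components <= 0")
    arrivals = sum(l * (1.0 - math.exp(-pi)) for l, pi in zip(lam, p))
    service = max(m * (1.0 - math.exp(pi)) for m, pi in zip(mu, p))
    return c + arrivals + service

def hamiltonian_brute_force(p, lam, mu, c):
    """Maximize H(p, u) over the extreme points of the assumed control set
    {u >= 0, sum(u) <= 1}: the zero vector and the unit vectors.
    H(p, u) is linear in u, so the maximum is attained at an extreme point."""
    J = len(p)
    candidates = [[0.0] * J]
    for i in range(J):
        e = [0.0] * J
        e[i] = 1.0
        candidates.append(e)
    return max(hamiltonian_given_u(p, u, lam, mu, c) for u in candidates)
```

For instance, with `lam = [1.0, 1.0]`, `mu = [1.0, 1.0]` and `c = 2 * (math.cosh(1.0) - 1.0)`, the value `hamiltonian([-1.0, 0.0], lam, mu, c)` vanishes up to rounding, which matches equation (51) with $\alpha_1 = 1$ in this symmetric toy case.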



We use the well known fact that the definition of viscosity solutions can be equivalently stated in terms of sub- and superdifferentials (see [?], Lemma II.1.7). Note that (26) and (27) hold, since $V = 0$ on $\partial_o G$. Hence it suffices to verify that (24) [resp., (25)] holds where $D\theta(y)$ is replaced by any superdifferential [subdifferential] of $V$ at $y$.

We show first that the equation $H(DV(x)) = 0$ holds wherever $DV$ is defined. The proposed form satisfies $DV(x) = -\alpha_i e_i$, wherever the gradient is defined, with $i = i_x$ depending on $x$. By the special form of the gradient, the equation $H(DV(x)) = 0$ takes the form

$$H(DV(x)) = c + \lambda_i (1 - e^{\alpha_i}) + \mu_i (1 - e^{-\alpha_i}) = 0, \tag{51}$$

where $i = i_x$. Denote $c_i = c/(\lambda_i + \mu_i)$. Then equivalently, $1 + c_i - F_i(\alpha_i) = 0$, where

$$F_i(\alpha) = \frac{\lambda_i e^{\alpha} + \mu_i e^{-\alpha}}{\lambda_i + \mu_i}.$$

The function $F_i$ is strictly convex, $F_i(0) = 1$, and $F_i(\alpha_i) \to \infty$ as $\alpha_i \to \infty$. Since $c_i > 0$, it follows that there are unique positive constants $\alpha_i$ where $F_i(\alpha_i) = 1 + c_i$, $i = 1, \ldots, J$. These are the constants in (49). In particular, (51) holds for $i = i_x$, and $H(DV(x)) = 0$.



Next consider any interior point $x$ at which the gradient is not defined. Clearly there are no subdifferentials at that point, and any superdifferential is given as a convex combination of $-\alpha_i e_i$, $i = 1, \ldots, J$. Let $B(e_i, \varepsilon)$ be the open ball of radius $\varepsilon$ about $e_i$. Denote $S = \{\nu \in \mathbb{R}^J : \nu_i \ge 0, \sum_i \nu_i = 1\}$, $\bar S = \{\nu \in \mathbb{R}^J : \nu_i \ge 0, \sum_i \nu_i \le 1\}$, $S_\varepsilon = \bar S \cap \bigcup_i B(e_i, \varepsilon)$, and $S_\varepsilon^c = \bar S - S_\varepsilon$. Let $q = -\sum_i \nu_i \alpha_i e_i$. It suffices to show that $H(q) \ge 0$ for $\nu \in S$, but since we later need a stronger statement than that, we show that in fact $H(q) \ge 0$ holds for $\nu \in \bar S$. By (50),

$$H(q) = c + \sum_i \lambda_i (1 - e^{\nu_i \alpha_i}) + \max_i \mu_i (1 - e^{-\nu_i \alpha_i}).$$

Define

$$H^1(q) = c + \sum_i \lambda_i (1 - e^{\nu_i \alpha_i}) + \mu_1 (1 - e^{-\nu_1 \alpha_1}),$$

$$\tilde H(q) = c + \sum_i \left[ \lambda_i (1 - e^{\nu_i \alpha_i}) + \mu_i (1 - e^{-\nu_i \alpha_i}) \right].$$

By (51), $c + \lambda_i (1 - e^{\alpha_i}) + \mu_i > 0$ and $c + \lambda_i (1 - e^{\alpha_i}) < 0$, and it follows that there are constants $A_1$, $A_2$, $A_3$ and $A_4$ such that for all $c$ and $i = 1, \ldots, J$,

$$A_1 + \log(c + A_2) \le \alpha_i \le A_3 + \log(c + A_4). \tag{52}$$

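We first consider small perturbations $\nu$ of $e_1$. To show that $H(q) \ge 0$, it suffices to show that $H^1(q) \ge 0$. Note that (51) implies $H^1(q)|_{\nu = e_1} = 0$. Also,

$$\nabla_\nu H^1(q)|_{\nu = e_1} = (-\lambda_1 \alpha_1 e^{\alpha_1} + \mu_1 \alpha_1 e^{-\alpha_1}) e_1 - \sum_{i \ne 1} \lambda_i \alpha_i e_i.$$

Hence, for $\gamma = e_i - e_1$ (where $i \ne 1$), using $c + \lambda_1 (1 - e^{\alpha_1}) < 0$, (51) and (52),

$$\begin{aligned} \nabla_\nu H^1(q)|_{\nu = e_1} \cdot \gamma &= \lambda_1 \alpha_1 e^{\alpha_1} - \mu_1 \alpha_1 e^{-\alpha_1} - \lambda_i \alpha_i \\ &= \alpha_1 (2\lambda_1 e^{\alpha_1} - \mu_1 - c - \lambda_1) - \alpha_i \lambda_i \\ &\ge \alpha_1 (c - \mu_1) - \alpha_i \lambda_i \\ &\ge [A_1 + \log(c + A_2)](c - \mu_1) - [A_3 + \log(c + A_4)]\lambda_i \\ &\ge 1, \end{aligned}$$

for all $c$ large. Analogous calculations give $\nabla_\nu H^1(q)|_{\nu = e_1} \cdot \gamma \ge 1$ for $\gamma = -e_1$ as well. As a result, the directional derivatives $(d/d\gamma) H^1(q)|_{\nu = e_1}$ in the direction $\gamma$, where $\gamma$ are of the form $\gamma = (y - e_1)/\|y - e_1\|$, $y \in \bar S$, are bounded below by $1/2$. Hence $H^1(q) \ge 0$ for $\nu \in \bar S$ within a neighborhood of $e_1$ and $c$ large. Consequently, a similar statement holds for $H(q)$. Since the same argument holds for neighborhoods of $e_i$, $i = 2, \ldots, J$, we conclude that there is $\varepsilon > 0$ and $c_0$ such that $H(q) \ge 0$ for $\nu \in S_\varepsilon$ and $c \ge c_0$.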

Next consider $\nu \in S_\varepsilon^c$. We first provide a lower bound on $(d/dc)\tilde H(q)$. Differentiating (51) with respect to $c$, $\alpha_i' = d\alpha_i/dc = (\lambda_i e^{\alpha_i} - \mu_i e^{-\alpha_i})^{-1}$. Using (52), for all $c$ large, $0 < \alpha_i' \le (\lambda_i e^{\alpha_i} - 1)^{-1}$. Using this, the fact that $\nu$ is bounded away from $\bigcup_i \{e_i\}$, and by taking $c$ large, one has

$$\frac{d}{dc}\tilde H(q) \ge 1 - \sum_i \lambda_i \nu_i \alpha_i' e^{\nu_i \alpha_i}.$$

Note that the above bound holds for all $c \ge c_1$ and all $\nu \in S_\varepsilon^c$, where $c_1$ is a constant. It follows that there is $c_2$ such that for all $c \ge c_2$ and all $\nu \in S_\varepsilon^c$, one has $\tilde H(q) \ge \sum_i \mu_i$. Since $H \ge \tilde H - \sum_i \mu_i$, $H(q) \ge 0$. We conclude that $H(q) \ge 0$ for all $\nu \in \bar S$. In particular, $H(q) \ge 0$ where $q$ is any superdifferential of $V$ at any interior point.

Finally, consider a point $x \in G \cap \partial\mathbb{R}_+^J$. Any superdifferential of $V$ at $x$ is given as $q = \sum_{i \in I(x)} \eta_i e_i - \sum_{j=1}^J \nu_j \alpha_j e_j$, where $\eta_i \ge 0$. If $\max_{i \in I(x)} \langle q, \gamma_i \rangle \ge 0$, then (24) holds. Otherwise, $\langle q, \gamma_i \rangle < 0$ for all $i \in I(x)$. Consequently, any $q$ of the form above is given as $-\sum_{j=1}^J \nu_j' \alpha_j e_j$, with $\nu' \in \bar S$. As we have shown, in this case, $H(q) \ge 0$. Therefore (24) holds.

Similarly, any subdifferential of $V$ at $x \in G \cap \partial\mathbb{R}_+^J$ is of the form $-\sum_{i \in I(x)} \eta_i e_i - \sum_{j=1}^J \nu_j \alpha_j e_j$. In particular, $\langle q, e_i \rangle \le 0$ for all $i$, and (25) holds. ■
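To make the structure (49) and the resulting service discipline concrete, here is a minimal Python sketch. It assumes the constants $\alpha_i$ have already been computed from (51) and passes them in as a list; it also reads the subdomain $G_i$ as the set of states where class $i$ attains the minimum in (49), which is an interpretation of the remark after Proposition 1 rather than a formula stated in the paper. Function names and the tie-breaking rule are hypothetical.

```python
def value(x, z, alpha):
    """V(x) = min_i alpha_i * (z_i - x_i), the form (49)."""
    return min(a * (zi - xi) for a, zi, xi in zip(alpha, z, x))

def priority_class(x, z, alpha):
    """0-based index of the class attaining the minimum in (49).
    Ties are broken in favor of the lowest index (an arbitrary choice)."""
    terms = [a * (zi - xi) for a, zi, xi in zip(alpha, z, x)]
    return min(range(len(terms)), key=terms.__getitem__)
```

In the totally symmetric case (equal $\alpha_i$ and equal $z_i$) the minimizing index is the one with the largest $x_i$, so `priority_class` reduces to "serve the longest queue": with `x = [1, 3]`, `z = [5, 5]`, `alpha = [2, 2]` it selects the second class.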



6 Proofs of lemmas 



Proof of Lemma 1. Let

$$\bar W^n(x) = \inf_u E_x^{u,n} e^{-nc\sigma^n}.$$

Since $c > 0$, $\bar W^n$ is well defined. Standard iterative methods can be used to construct a solution to 
the DPE 

$$0 = \inf_{u \in U} \left[ \mathcal{L}^{n,u} W^n(x) - nc\, W^n(x) \right], \qquad x \in G^n, \tag{53}$$

and the boundary condition $W^n(x) = 1$ if $x \notin G^n$. We claim that this solution coincides with the 
risk-sensitive cost. To see this, consider a controlled Markov process $(X^n, u)$ that starts at $x$. Then 

$$Y(t) = W^n(X^n(t)) - W^n(x) - \int_0^t \mathcal{L}^{n,u(s)} W^n(X^n(s))\,ds$$

is a martingale. Equation (53) implies $\mathcal{L}^{n,u(s)} W^n(X^n(s)) \ge nc\,W^n(X^n(s))$, and so 

$$W^n(X^n(t)) - W^n(x) - \int_0^t nc\,W^n(X^n(s))\,ds = \int_0^t Z(s)\,ds + Y(t)$$

for some nonnegative process $Z$. Using Gronwall's lemma we obtain that for each $t < \infty$ 

$$E_x^{u,n}\, W^n(X^n(t \wedge \sigma^n))\, e^{-nc(t \wedge \sigma^n)} \ge W^n(x),$$

and by the Lebesgue Dominated Convergence Theorem 

$$E_x^{u,n} e^{-nc\sigma^n} \ge W^n(x).$$

If we define $u$ in terms of the feedback control that minimizes in (53) then all the inequalities above 
become equalities, thus showing that $\bar W^n = W^n$. 
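The iterative construction of a solution to the DPE (53) can be illustrated on a toy example. The following is a minimal sketch, assuming a single queue on the lattice $\{0,\dots,B\}$ with exit at level $B$, arrival rate $n\lambda$, service rate $n\mu u$ and a hypothetical control set $u \in \{0,1\}$; the function names and the Gauss-Seidel scheme are illustrative choices, not the construction used in the paper.

```python
import math

def solve_dpe(lam, mu, c, n, B, controls=(0.0, 1.0), tol=1e-12, max_iter=100_000):
    """Gauss-Seidel iteration for 0 = min_u [L^{n,u} W - n c W] on {0,...,B-1}, W(B) = 1."""
    W = [1.0] * (B + 1)
    for _ in range(max_iter):
        delta = 0.0
        for k in range(B):
            down = max(k - 1, 0)  # a service jump at an empty queue is projected away
            best = min(
                (n * lam * W[k + 1] + n * mu * u * W[down]) / (n * lam + n * mu * u + n * c)
                for u in controls
            )
            delta = max(delta, abs(best - W[k]))
            W[k] = best
        if delta < tol:
            break
    return W

def dpe_residual(W, lam, mu, c, n, controls=(0.0, 1.0)):
    """Value of min_u [L^{n,u} W - n c W] at each interior state; zero for a solution."""
    B = len(W) - 1
    out = []
    for k in range(B):
        down = max(k - 1, 0)
        out.append(min(
            n * lam * (W[k + 1] - W[k]) + n * mu * u * (W[down] - W[k]) - n * c * W[k]
            for u in controls
        ))
    return out

def value_from_cost(W, n):
    """V = -(1/n) log W, the logarithmic transform used for the risk-sensitive cost."""
    return [-math.log(w) / n for w in W]
```

Each update uses the identity $W(k) = \min_u a_u / b_u$, where $a_u$ collects the rate-weighted neighbouring values and $b_u$ is the total rate plus $nc$; since $c > 0$ the update is a contraction, which is one way to read the remark that $c > 0$ makes the cost well defined.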

The definition of $\bar W^n$ implies $\bar W^n(x) = \exp[-nV^n(x)]$. If we insert this into the DPE of $W^n$ 
and multiply by $\exp[nV^n(x)]$ then the equation 

$$0 = \inf_{u \in U} \Big[ \sum_j n\lambda_j \Big( \exp\Big[-nV^n\big(x + \tfrac{1}{n} v_j\big) + nV^n(x)\Big] - 1 \Big) + \sum_i n\mu_i u_i \Big( \exp\Big[-nV^n\big(x + \tfrac{1}{n}\pi(x, \tilde v_i)\big) + nV^n(x)\Big] - 1 \Big) - nc \Big]$$

results. Recall the definition $l(x) = x\log x - x + 1$ for $x \ge 0$. We now divide throughout by $n$ and 
use the convex duality relation 

$$e^y - 1 = \sup_{x \ge 0}\,[xy - l(x)]$$

to represent the terms in the previous display. For example, in the sum on $j$ we take $x = \bar\lambda_j/\lambda_j$ 
and $y = -\big[nV^n(x + n^{-1} v_j) - nV^n(x)\big]$. Representing each term in this way and multiplying by $-1$ 
produces the first line of the DPE for $V^n$. The boundary condition that is the second line of that DPE follows directly 
from the relation between $W^n$ and $V^n$. ■ 
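The convex duality relation used in this proof can be checked numerically. The sketch below is an illustration only: it approximates the supremum on a uniform grid (the grid size and the cutoff `x_max` are arbitrary assumptions) and compares it with $e^y - 1$; it also evaluates the objective at the maximizer $x = e^y$, which is the value that yields $\bar\lambda_j = \lambda_j e^{y}$ when the relation is applied term by term.

```python
import math

def ell(x):
    """l(x) = x log x - x + 1 for x >= 0, with l(0) = 1 by continuity."""
    return 1.0 if x == 0.0 else x * math.log(x) - x + 1.0

def dual_sup(y, grid_points=200_001, x_max=20.0):
    """Approximate sup over x in [0, x_max] of x*y - l(x) on a uniform grid."""
    best = -math.inf
    for k in range(grid_points):
        x = x_max * k / (grid_points - 1)
        best = max(best, x * y - ell(x))
    return best

def dual_at_maximizer(y):
    """Objective at x = e^y, where the derivative y - log x vanishes."""
    x = math.exp(y)
    return x * y - ell(x)
```

The grid value never exceeds the true supremum, so agreement with $e^y - 1$ up to the grid resolution is what one expects for $y$ with $e^y < $ `x_max`.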

Proof of Lemma 2. We reduce the Lipschitz property on $(n^{-1}\mathbb{Z}^J_+) \cap G$ to a Lipschitz property 
near the boundary. To this end we use the following coupling. For $z \in G^n$, let $u^n(z)$ be a minimizer 
in the DPE. Given a point $x$ on the lattice, let $X^x$ denote the process corresponding to the generator 
$\mathcal{L}^{n,u}$ and starting at $x$ (see the discussion following (O)). To simplify the notation we will not 
explicitly denote the dependence of quantities such as $X^x$ on $n$. Let $u(t) = u^n(X^x(t))$, and let $F_t$ 
be the filtration generated by $X^x$. 

Fix a point $y \ne x$ and let $X^y$ denote the queueing process on this probability space that starts 
at $y$ and uses the control $u$. In other words, $X^y$ is the image, under the Skorokhod map, of 
$y + X^x(\cdot) - x$. The evolution of the processes $X^x$ and $X^y$ is identical, save that jumps which 
would cause $X^y$ to leave the nonnegative orthant are deleted. Automatically, $u$ is suboptimal for the control problem 
starting from $y$. Define 

$$V^n(y;u) = -n^{-1} \log E_x^{u,n} e^{-nc\sigma^y}, \tag{54}$$

where $\sigma^y$ is the exit time of $X^y$ from $G$. Note that due to the coupling we may take expectations 
with respect to $E_x^{u,n}$ rather than with respect to $E_y^{u,n}$. Since $(X^y, u)$ is a (possibly suboptimal) 
controlled Markov process, we have 

$$V^n(x) - V^n(y) \le V^n(x) - V^n(y;u). \tag{55}$$
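The coupling can be visualised in one dimension. The sketch below is a simplified illustration, not the multidimensional map of the paper: it assumes a single queue, reflects a discrete path at zero with the explicit one-dimensional Skorokhod formula, and drives two paths started at `x` and `y` with the same increments. All function names are hypothetical.

```python
import random

def skorokhod_1d(path):
    """Reflect a discrete path at 0: phi_k = psi_k + max(0, max_{j<=k} (-psi_j))."""
    reflected, running = [], 0.0
    for psi in path:
        running = max(running, -psi)
        reflected.append(psi + running)
    return reflected

def coupled_paths(x, y, increments):
    """Free paths started at x and y, driven by the same increments, then reflected."""
    free_x, free_y = [x], [y]
    for d in increments:
        free_x.append(free_x[-1] + d)
        free_y.append(free_y[-1] + d)
    return skorokhod_1d(free_x), skorokhod_1d(free_y)
```

In this one-dimensional setting the two reflected paths stay within $2|x - y|$ of each other at all times, which is the kind of Lipschitz bound that the argument relies on.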



Define $\sigma = \min\{\sigma^x, \sigma^y\}$. By the theorem on the Lipschitz continuity of the Skorokhod map we 
have 

$$\mathrm{dist}(X^x(\sigma), \partial G) \le K_1|x - y|, \qquad \mathrm{dist}(X^y(\sigma), \partial G) \le K_1|x - y|, \tag{56}$$

since at least one of the processes has left $G$ by $\sigma$. In the last display, $K_1$ is the constant appearing 
in that theorem. We claim that 

$$V^n(x) - V^n(y) \le \sup\{V^n(z) : z \in S\} \tag{57}$$

where $S = \{z \in (n^{-1}\mathbb{Z}^J_+) \cap G : \mathrm{dist}(z, \partial_{co} G) \le K_1|x - y|\}$. To establish this, note that 



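$$
\begin{aligned}
V^n(x) - V^n(y;u) &= -\frac{1}{n}\Big[\log E_x^{u,n} e^{-nc\sigma^x} - \log E_x^{u,n} e^{-nc\sigma^y}\Big] \\
&\le -\frac{1}{n}\Big[\log E_x^{u,n}\Big[e^{-nc\sigma}\, E_x^{u,n}\big(e^{-nc(\sigma^x - \sigma)} \,\big|\, X^x(\sigma)\big)\Big] - \log E_x^{u,n} e^{-nc\sigma}\Big] \\
&\le \sup_{z \in S}\, -\frac{1}{n}\Big[\log E_x^{u,n}\Big[e^{-nc\sigma}\, E_x^{u,n}\big(e^{-nc(\sigma^x - \sigma)} \,\big|\, X^x(\sigma) = z\big)\Big] - \log E_x^{u,n} e^{-nc\sigma}\Big] \\
&= \sup_{z \in S}\, -\frac{1}{n}\Big[\log \Big[E_x^{u,n}\big(e^{-nc(\sigma^x - \sigma)} \,\big|\, X^x(\sigma) = z\big)\, E_x^{u,n} e^{-nc\sigma}\Big] - \log E_x^{u,n} e^{-nc\sigma}\Big].
\end{aligned}
$$

The first inequality uses $\sigma \le \sigma^y$. However, by the strong Markov property, 

$$-\frac{1}{n}\Big[\log E_x^{u,n}\big(e^{-nc(\sigma^x - \sigma)} \,\big|\, X^x(\sigma) = z\big)\Big] \le \sup_u\, -\frac{1}{n} \log\big(E_z^{u,n} e^{-nc\sigma^z}\big) = V^n(z),$$

and together with (55) we have (57). 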

To prove the lemma, one needs to show that $|V^n(x) - V^n(y)| \le c_0|x - y|$ for all $n$ and all 
$x, y \in (n^{-1}\mathbb{Z}^J_+) \cap G$, where $c_0$ does not depend on $x$, $y$ and $n$. It suffices to prove this inequality 
for $x, y$ such that $|x - y| = n^{-1}$. Since the roles of $x$ and $y$ are symmetric, and in view of (57), it 
suffices to show that for $x \in \{x \in G : \mathrm{dist}(x, \partial_{co} G) \le K_1 n^{-1}\}$, 

$$V^n(x) = -n^{-1} \log \inf_u E_x^{u,n} e^{-nc\sigma^x} \le c_1 n^{-1}, \tag{58}$$

where $c_1 > 0$ is a constant. 



Let us first treat the case where $G$ is not a rectangle. In that case, the condition imposed on $G$ implies that for 
any $x$ with $\mathrm{dist}(x, \partial_{co} G) \le K_1 n^{-1}$, 

$$\text{there is } i \in \mathcal{J}_+ \text{ such that } x + c' n^{-1} e_i \notin G, \tag{59}$$

where $c'$ is a constant. Let such $i$ be fixed. To show (58), it is enough to show that for any $x$ such 
that $\mathrm{dist}(x, \partial_{co} G) \le K_1 n^{-1}$, and any $n$ and $u$, 

$$E_x^{u,n} e^{-nc\sigma^x} \ge c_2 > 0. \tag{60}$$



Recall that $\lambda_i > 0$. Let $S_t$ denote the event that all service processes and all arrival processes, 
except for the one corresponding to $i$, do not increase on $[0, t]$. Recall that the expected time till a 
Poisson process of rate $\lambda$ hits level $K$ is $K/\lambda$. Then for any $a \in (0, 1)$ 

$$E_x^{u,n} e^{-nc\sigma^x} \ge a\, P_x^{u,n}\big(e^{-nc\sigma^x} \ge a\big) = a\, P_x^{u,n}\Big(\sigma^x \le -\frac{\log a}{nc}\Big).$$

Choosing $t_0 = -(\log a)/nc = 2c'/n\lambda_i$ and using $P_x^{u,n}(\sigma^x < 2E\sigma^x) \ge 1/2$, 

$$E_x^{u,n} e^{-nc\sigma^x} \ge a\, P_x^{u,n}\big(\sigma^x \le t_0 \,\big|\, S_{t_0}\big)\, P_x^{u,n}(S_{t_0}) \ge \frac{1}{2}\, e^{-2cc'/\lambda_i}\, c_3,$$

where $c_3 > 0$ is the probability that a Poisson process with rate $nc_4$ has not jumped by time 
$t_0 = 2c'/\lambda_i n$. This proves (60), which implies (58), and hence the statement of the lemma holds. 
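Two elementary facts about Poisson processes carry this estimate, and both can be checked exactly. The sketch below is an illustration under stated assumptions (the function names are hypothetical): it computes the probability that a Poisson process of a given rate reaches level $K$ by time $t$, confirms that this probability is at least $1/2$ at twice the mean hitting time $K/\lambda$, and shows that the probability of no jump is unchanged when the rate is multiplied by $n$ and the time divided by $n$, which is why the lower bound does not depend on $n$.

```python
import math

def expected_hitting_time(level, rate):
    """Mean time for a Poisson process of the given rate to reach `level`."""
    return level / rate

def prob_hit_by(level, rate, t):
    """P(N(t) >= level) = 1 - sum_{k < level} e^{-rate t} (rate t)^k / k!."""
    m = rate * t
    term, cdf = math.exp(-m), 0.0
    for k in range(level):
        cdf += term
        term *= m / (k + 1)
    return 1.0 - cdf

def prob_no_jump(rate, t):
    """Probability that a Poisson process of the given rate has not jumped by time t."""
    return math.exp(-rate * t)
```

The bound at twice the mean is just Markov's inequality applied to the hitting time; the exact Erlang probabilities computed here are typically much larger than $1/2$.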

In the case where $G$ is a rectangle, the bound (57) does not suffice since $V^n(x)$ is discontinuous 
near $\partial_c G$. We therefore prove that a similar bound applies, where the supremum is over $S = \{z \in 
(n^{-1}\mathbb{Z}^J_+) \cap G : \mathrm{dist}(z, \partial_o G) \le K_1|x - y|\}$. To apply the previous argument we need to show that if 
$X^x(t)$ is close to $\partial_c G$, then neither $X^x$ nor $X^y$ will exit (locally) through that boundary. This is 
clear for $X^x$: the only way for the process to leave $G$ is due to a service to one of the queues, say, 
queue $j$, leading to an increase in queue $i$. However, allowing this service is certainly not optimal: 
it is better to avoid this control, as our objective is to increase $\sigma^x$. To prevent $X^y$ from exiting 
we need to modify the coupling argument as follows. The control $u^y$ used by $X^y$ avoids a jump 
that leads $X^y$ to exit through $\partial_c G$ (that is, queue $j$ above will not be served if $X^y_i(t) = z_i - 1$). 
Note that this is the only possible type of jump that leads the process out of $G$. Moreover, the 
$\ell_1$ distance between $X^x$ and $X^y$ may only decrease due to this change in control: the control is 
changed only if $X^x_i(t) < X^y_i(t)$, and following the service $X^x_i$ increases by 1 so that $|X^x_i(t) - X^y_i(t)|$ 
decreases by 1, while $X^x_j$ decreases by 1. 

Condition ^ still implies (|59j). but only for x such that 

Zi — X{ < for some i £ J + . 

For such $x$, the argument in the last paragraph holds. However, for $x$ near $\partial_c G$ there is nothing to 
prove, since the process never exits through such a boundary. ■ 

Proof of Lemma 3. The first part is an immediate consequence of the fact that $u_i \ge 0$ and both 
$\mathcal{L}^{n,u,m} V^n(x)$ and $\rho(u, m)$ depend on $u$ as $\sum_i u_i \eta_i$, where $\eta_i$ is a function of $m_i$, $x$, $n$ but not of $u$. 

For the second part of the lemma, one can explicitly solve for $m^n$ in terms of $V^n$, and get 

$$m^n(x,u) = m^n(x) = \big((\bar\lambda_i^n(x)), (\bar\mu_i^n(x))\big),$$

where 

$$\bar\lambda_i^n(x) = \lambda_i e^{-n\delta_i V^n(x)}, \qquad \bar\mu_i^n(x) = \mu_i e^{-n\tilde\delta_i V^n(x)},$$

and 

$$\delta_i V^n(x) = V^n(x + n^{-1} v_i) - V^n(x), \qquad \tilde\delta_i V^n(x) = V^n(x + n^{-1}\pi(x, \tilde v_i)) - V^n(x).$$

The result follows from Lemma 2 since it shows that there is a constant $b_2$ independent of $x, n$ 
where 

$$n\delta_i V^n(x) \ge -b_2, \qquad n\tilde\delta_i V^n(x) \ge -b_2.$$

■ 

Proof of Lemma 4. We fix $b \in [b^*, \infty]$ and suppress it from the notation throughout the proof. 
Item 1 of the lemma is trivial under Condition 1. Under Condition 2, by continuity of the 
functions $\phi_i$, we only need to show that $\partial_{co} G_a$ and $\partial_{co} G$ do not intersect. Consider first $a > 0$, 
and let $x \in \partial_{co} G_a$. Then $x_i = a + \phi_i(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_J)$ for some $i$, and therefore $x$ 
cannot belong to the closure of $G$. The proof for $a < 0$ is similar. 

Let $\beta_0[u](t) = m_0$ for all $u, t$, where $m_0$ sets all $\bar\lambda_i = \lambda_i$ and $\bar\mu_i = 0$. Then $\rho(u(t), \beta_0[u](t))$ is 
bounded by a constant, and the dynamics, unaffected by $u$, follow $X(t) = x + \sum_i \lambda_i v_i t$ and leave 
the bounded set $G$ within a finite time bounded by $\mathrm{diam}(G)/\max_j \lambda_j < \infty$. Therefore 

$$V^-(x) \le \sup_u C(x, u, m_0) = C(x, \beta_0) \le c_1 < +\infty.$$

Similarly, 

$$V^+(x) \le \sup_\alpha C(x, \alpha(m_0), m_0) \le c_1 < +\infty.$$
It is useful to notice that for all $a \in (0, \infty)$ and $y \in \partial_o G$ there is $i = i_y \in \mathcal{J}_+$ such that 
$y + 2a e_{i_y} \notin G_a$. Similarly, for all $a \in (-a_0, 0)$ and $y \in \partial_o G_a$ there is $i = i_y \in \mathcal{J}_+$ such that 
$y + 2a e_{i_y} \notin G$. 



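First consider $a > 0$ and recall that $\sigma_a$ (resp., $\sigma$) is the exit time from $G_a$ (resp., $G$), so that for 
any fixed $u$ and $\beta[u]$ we have $\sigma \le \sigma_a$. Therefore, since $c$ and $\rho$ are positive, 

$$
\begin{aligned}
V_a^-(x) &= \inf_\beta \sup_u \int_0^{\sigma_a} \big(c + \rho(u(s), \beta[u](s))\big)\,ds \\
&\ge \inf_\beta \sup_u \int_0^{\sigma} \big(c + \rho(u(s), \beta[u](s))\big)\,ds \\
&= V^-(x).
\end{aligned}
$$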

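Thus to prove the Lipschitz property a one-sided bound suffices. Recall that $\phi(\sigma)$ is the exit point 
from $G$ and for each $\beta$ define the extension $\beta_a$ by 

$$\beta_a[u](t) = \begin{cases} \beta[u](t), & t \in [0, \sigma), \\ \hat m, & t \in [\sigma, \infty), \end{cases}$$

where $\hat m$ sets all $\bar\mu_j = 0$ and $\bar\lambda_j = 1_{j = i_{\phi(\sigma)}}$. Then 

$$
\begin{aligned}
V_a^-(x) &= \inf_\beta \sup_u C_a(x, \beta[u], u) \\
&\le \inf_\beta \sup_u C_a(x, \beta_a[u], u) \\
&= \inf_\beta \sup_u \Big[ C(x, \beta[u], u) + \int_\sigma^{\sigma_a} \big(c + \rho(u(s), \hat m)\big)\,ds \Big] \\
&\le V^-(x) + c_1 a,
\end{aligned}
$$

where the last line follows since $\rho(u(s), \hat m)$ is bounded and since by the previous paragraph, $\sigma_a - \sigma \le 
2a$. Note that $c_1$ does not depend on $b \in [b^*, \infty]$. For $a < 0$ the same argument shows that 
$V^-(x) \le V_a^-(x) + c_1|a|$, by interchanging the roles of $G$ and $G_a$. 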

For $V_a^+(x)$ note that an argument as above gives $V_a^+(x) \ge V^+(x)$. For each $m$ define $m_a$ by 

$$m_a(t) = \begin{cases} m(t), & t \in [0, \sigma), \\ \hat m, & t \in [\sigma, \infty), \end{cases}$$

where $\hat m$ is as above. Let $\alpha_\epsilon$ be an $\epsilon$-optimal strategy. Then, since for any fixed $u$ and $m$ we have 
$\sigma \le \sigma_a$, 

$$
\begin{aligned}
V_a^+(x) &= \sup_\alpha \inf_m C_a(x, m, \alpha[m]) \\
&\le \inf_m C_a(x, m, \alpha_\epsilon[m]) + \epsilon \\
&\le \inf_m C_a(x, m_a, \alpha_\epsilon[m_a]) + \epsilon,
\end{aligned}
$$

since we are taking the infimum over a smaller class of controls. By the definition of $C_a$, 

$$
\begin{aligned}
V_a^+(x) &\le \inf_m \Big[ C(x, m, \alpha_\epsilon[m]) + \int_\sigma^{\sigma_a} \big(c + \rho(\alpha_\epsilon[m_a](s), m_a(s))\big)\,ds \Big] + \epsilon \\
&\le \sup_\alpha \inf_m C(x, m, \alpha[m]) + c_2 a + \epsilon
\end{aligned}
$$

by the previous argument, where $c_2$ does not depend on $x$, $\epsilon$ and $b$. Since $\epsilon$ is arbitrarily small, the 
proof for $a > 0$ and $V_a^+(x)$ is established. ■ 

Proof of Lemma 5. We suppress $b$ from the notation throughout the proof. It is obvious that 
one can restrict the infimum over $B$ to the class of strategies for which $C(x, \beta) \le V^-(x) + 1$. 
Within this class, for every $\beta$ and $u$, 

$$\sigma c \le \int_0^\sigma \big[c + \rho(u(t), \beta[u](t))\big]\,dt \le V^-(x) + 1,$$

and therefore one always has that $\sigma \le T_0 = (V^-(x) + 1)/c$. The lemma asserts an upper bound on the 
cost up to a fixed time $T_0$, and so we must define the strategy for times $t \in [\sigma, T_0]$. Let $\hat m$ be an 
arbitrary fixed element of $M$. Then the extended $\bar\beta$ is just 

$$\bar\beta[u](t) = \begin{cases} \beta[u](t), & t < \sigma, \\ \hat m, & t \ge \sigma. \end{cases}$$

With this definition one has that $C(x, u, \bar\beta[u]) = C(x, u, \beta[u])$. One can therefore further restrict 
to strategies $\beta$ satisfying $\bar\beta = \beta$. For such $\beta$, it follows that 

$$\int_0^{T_0} \rho(u(t), \beta[u](t))\,dt \le c_1 T_0,$$

where $c_1$ does not depend on $u, \beta$. The result regarding $V^-$ follows. 

Regarding $V^+$, let $m_0$ be a control which sets all $\bar\mu_i$ and $\bar\lambda_i$ to zero, except that $\bar\lambda_{i_0} = 1$ for some 
$i_0 \in \mathcal{J}_+$. Then for any $\alpha \in A$ and $m$ for which 

$$C(x, \alpha[m], m) \le C(x, \alpha[m_0], m_0), \tag{61}$$

one has $c\,\sigma(x, \alpha[m], m) \le C(x, \alpha[m], m) \le C(x, \alpha[m_0], m_0) \le c_1 < \infty$. Note that $c_1$ can be chosen 
independent of $\alpha$, since the dynamics and running cost under $m_0$ are independent of $\alpha$. Clearly, 
for each $\alpha$ it suffices to consider, in optimizing over $m$, only those $m$ that satisfy (61). It follows 
that it suffices to consider only those $m$ for which $\sigma(x, \alpha[m], m) \le c_1/c$. This completes the proof 
of the lemma. ■ 

Proof of Lemma 6. Fix $b \in [b^*, \infty]$ which we omit from the notation. Recall from Lemma 4 that 
$V^\pm$ are bounded on $G$. We first show that $V^-$ is Lipschitz. Assume first that Condition 2 holds. 
Recall that for $x \in G$, 

$$V^-(x) = \inf_\beta \sup_u C(x, \beta[u], u).$$

Let $\beta_\epsilon^x$ be an $\epsilon$-optimal strategy starting from $x$, i.e., 

$$\sup_u C(x, \beta_\epsilon^x[u], u) \le V^-(x) + \epsilon.$$

For any $z \in G$ let $\sigma_z = \inf\{t : \phi_z(t) \notin G\}$, where $\phi_z$ is the solution to $\dot\phi = \pi(\phi, v(u, \beta_\epsilon^x[u]))$, with 
$\phi(0) = z$. Note that $C(x, u, \beta_\epsilon^x[u]) = \int_0^{\sigma_x} [c + \rho(u(t), \beta_\epsilon^x[u](t))]\,dt$ (with possibly $\sigma_x = \infty$). Now let 
$y \in G$. Note that on $[0, \sigma_x \wedge \sigma_y]$, one has by the Lipschitz property of the Skorokhod map that 
$|\phi_x(t) - \phi_y(t)| \le c_1|x - y|$, where $c_1$ is some constant. Recall that we are considering the case 
of Condition 2. Therefore, at $\sigma_x \wedge \sigma_y$, both $\phi_x$ and $\phi_y$ are within a distance of $c_1|x - y|$ of the 
boundary $\partial_o G$. Because of the assumptions on the domain $G$, there exists a constant $c_2$ such that 
at time $\sigma_x \wedge \sigma_y$, both $\phi_x + c_2 e_{i^*} \notin G$ and $\phi_y + c_2 e_{i^*} \notin G$, for a suitable index $i^*$, and, moreover, $c_2$ is 
independent of $x$, $y$, $b \ge b^*$ and $i^*$. 

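Define $\beta_\epsilon^{x,y}$ as $\beta_\epsilon^{x,y}[u] = \beta_\epsilon^x[u]$ on $[0, \sigma_x)$, and, if $\sigma_y > \sigma_x$, set $\beta_\epsilon^{x,y}[u] = m_0$ on $[\sigma_x, \sigma_y]$. Here, 
$m_0$ sets all $\bar\mu_i$ and all $\bar\lambda_i$ to zero, except that it sets $\bar\lambda_{i^*} = \lambda_{i^*}$, where $i^*$ is as above. Consequently, 
$\sigma_y \le \sigma_x + c'|x - y|$ for some $c' > 0$, and there exists $u_\epsilon$ such that 

$$
\begin{aligned}
V^-(y) &\le \sup_u C(y, u, \beta_\epsilon^{x,y}[u]) \\
&\le \int_0^{\sigma_y} \big[c + \rho(u_\epsilon(s), \beta_\epsilon^{x,y}[u_\epsilon](s))\big]\,ds + \epsilon \\
&\le \int_0^{\sigma_x} \big[c + \rho(u_\epsilon(s), \beta_\epsilon^{x,y}[u_\epsilon](s))\big]\,ds + 1_{\sigma_x < \sigma_y} \int_{\sigma_x}^{\sigma_y} \big[c + \rho(u_\epsilon(s), \beta_\epsilon^{x,y}[u_\epsilon](s))\big]\,ds + \epsilon \\
&\le C(x, u_\epsilon, \beta_\epsilon^x[u_\epsilon]) + c_3|x - y| + \epsilon \\
&\le \sup_u C(x, u, \beta_\epsilon^x[u]) + c_3|x - y| + \epsilon \\
&\le V^-(x) + c_3|x - y| + 2\epsilon.
\end{aligned}
$$

Since $x, y \in G$ and $\epsilon > 0$ are arbitrary, and $c_3$ does not depend on them or on $b$, $V^-$ is Lipschitz, 
uniformly for $b \in [b^*, \infty]$. 

In case that Condition 1 holds, the same argument shows that $V^-(y) \le V_a^-(x) + c_3|x - y| + 2\epsilon$, 
where $a = c_1|x - y|$. By Lemma 4 this implies that $V^-(y) \le V^-(x) + c_4|x - y| + 2\epsilon$ for some constant 
$c_4$, and therefore $V^-$ is Lipschitz. 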

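Next, consider the upper value 

$$V^+(x) = \sup_\alpha \inf_m C(x, \alpha[m], m)$$

under Condition 2. Let $x, y \in G$. Note that there is an $\alpha_\epsilon^x$ such that 

$$V^+(x) \le \inf_m C(x, \alpha_\epsilon^x[m], m) + \epsilon,$$

and an $m_\epsilon = m_\epsilon(x, y)$ for which 

$$V^+(y) \ge \inf_m C(y, \alpha_\epsilon^x[m], m) \ge C(y, \alpha_\epsilon^x[m_\epsilon], m_\epsilon) - \epsilon.$$

Let $\sigma_z = \inf\{t : \phi_z(t) \notin G\}$, where $\phi_z$ is the solution to $\dot\phi = \pi(\phi, v(\alpha_\epsilon^x[m_\epsilon], m_\epsilon))$, with $\phi(0) = z$. 
Let $i^*$ be defined in an analogous way to that in the first paragraph of the proof. Now define 
$\hat m_\epsilon = \hat m_\epsilon(x, y)$ as follows. If $\sigma_x \le \sigma_y$, let $\hat m_\epsilon = m_\epsilon$. If $\sigma_y < \sigma_x$, let $\hat m_\epsilon$ agree with $m_\epsilon$ on $[0, \sigma_y)$ and 
with $m_0$ on $[\sigma_y, \sigma_x]$. Here, $m_0$ sets all $\bar\mu_i$ and all $\bar\lambda_i$ to zero, except that it sets $\bar\lambda_{i^*} = \lambda_{i^*}$. Since $m_\epsilon$ 
and $\hat m_\epsilon$ agree on $[0, \sigma_y)$, the restrictions to $[0, \sigma_y]$ of $\alpha_\epsilon^x[m_\epsilon]$ and of $\alpha_\epsilon^x[\hat m_\epsilon]$ agree a.e. on $[0, \sigma_y]$, and 
therefore, 

$$C(y, \alpha_\epsilon^x[m_\epsilon], m_\epsilon) = C(y, \alpha_\epsilon^x[\hat m_\epsilon], \hat m_\epsilon).$$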
Arguing again by the Lipschitz property of the Skorokhod map and the definition of $m_0$, there is a 
constant $c_4$ for which $(\sigma_x - \sigma_y)^+ \le c_4|x - y|$. Hence 

$$
\begin{aligned}
V^+(y) &\ge C(y, \alpha_\epsilon^x[\hat m_\epsilon], \hat m_\epsilon) - \epsilon \\
&\ge \int_0^{\sigma_x} \big[c + \rho(\alpha_\epsilon^x[\hat m_\epsilon](s), \hat m_\epsilon(s))\big]\,ds - 1_{\sigma_y < \sigma_x} \int_{\sigma_y}^{\sigma_x} \big[c + \rho(\alpha_\epsilon^x[\hat m_\epsilon](s), \hat m_\epsilon(s))\big]\,ds - \epsilon \\
&\ge C(x, \alpha_\epsilon^x[\hat m_\epsilon], \hat m_\epsilon) - c_5|x - y| - \epsilon \\
&\ge \inf_m C(x, \alpha_\epsilon^x[m], m) - c_5|x - y| - \epsilon \\
&\ge V^+(x) - c_5|x - y| - 2\epsilon.
\end{aligned}
$$

Since $c_5$ does not depend on $x$, $y$, $\epsilon$ or $b$, we have that $V^+$ is Lipschitz uniformly for $b \in [b^*, \infty]$. 

Under Condition 1, the same argument shows that $V_a^+(y) \ge V^+(x) - c_5|x - y| - 2\epsilon$, where 
$a = c_1|x - y|$, and again one argues by Lemma 4. ■ 

Proof of Lemma 7. The processes are constructed recursively using a sequence of standard 
exponential clocks. Recall that $\mathcal{L}^{n,u,m}$ is given for every $n$, $u \in U$, $m \in M$ by 

$$\mathcal{L}^{n,u,m} f(x) = \sum_{j=1}^J n\bar\lambda_j \big[f(x + n^{-1} v_j) - f(x)\big] + \sum_{i=1}^J n\bar\mu_i u_i \big[f(x + n^{-1}\pi(x, \tilde v_i)) - f(x)\big].$$

Given $n$, $x_n$ and $\beta$, we construct a filtered probability space and three processes, $X(t)$, $u(t)$ and 
$\bar m(t)$ (to simplify notation, we do not write the superscript $n$ in the notation of $X^n$, $u^n$ and $\bar m^n$) 
such that (a) $X$, $u$ and $\bar m$ are $(F_t)$-adapted; (b) $\bar m(t) = \beta[u](t)$ a.e. $t \ge 0$, a.s.; (c) $u(\cdot) = u^n(X(\cdot))$ 
a.s. (where $u^n$ is as in the statement before the lemma); and (d) for any $f$, the process 

$$f(X(t)) - \int_0^t \mathcal{L}^{n,u(s),\bar m(s)} f(X(s))\,ds$$

is an $(F_t)$-martingale. For (a-d) to hold, it suffices that (a-c) hold, and (e) on any finite interval 
the process $X$ jumps finitely many times (we denote the $k$th jump by $\tau_k$ and let $\tau_0 = 0$); (f) the 
random times $(\tau_k)$ are stopping times on $(F_t)$, and (g) denoting $X_k = X(\tau_k)$, for any $k$, 

$$E[f(X_{k+1}) - f(X_k) \mid F_{\tau_k}] = \sum_i E\big[A_i^{k,u,\bar m} + B_i^{k,u,\bar m} \mid F_{\tau_k}\big],$$

where 

$$
\begin{aligned}
A_i^{k,u,\bar m} &= n \int_{\tau_k}^{\tau_{k+1}} \bar\lambda_i(s)\,ds\, \big[f(X_k + n^{-1} v_i) - f(X_k)\big], \\
B_i^{k,u,\bar m} &= n \int_{\tau_k}^{\tau_{k+1}} \bar\mu_i(s) u_i(s)\,ds\, \big[f(X_k + n^{-1}\pi(X_k, \tilde v_i)) - f(X_k)\big].
\end{aligned}
$$

The construction is recursive. On a complete probability space $(\Omega, F, P)$ we are given $2J$ 
independent standard Poisson processes, denoted $a_i$ and $b_i$, $i = 1, \dots, J$. Let $T_i^a(k)$ [resp., 
$T_i^b(k)$] denote the first time $a_i$ [resp., $b_i$] equals $k$. For each $\omega \in \Omega$ we construct recursively a 
sequence of times $(\tau_k)$ and the processes $X$, $u$ and $\bar m$ up to time $\tau_k$. Once these processes are 
defined, we will define $(F_t)$, $F_t \subset F$, $t \ge 0$, and verify that items (a-c), (e-g) are satisfied on 
$(\Omega, F, (F_t), P)$. 

We set $X(0) = x_n$ and $u(0) = u^n(x_n)$. Since $\bar m$ need only be defined almost everywhere on 
$[0, \infty)$, we do not define it at zero nor at any $\tau_k$, $k = 1, 2, \dots$. Now assume that we have constructed 
$\tau_i$, $i \le k$ as well as the processes $X$ and $u$ on $[0, \tau_k]$ and $\bar m$ a.e. on $[0, \tau_k]$. Let $u^k(t) = u^n(X(t \wedge \tau_k))$, 
$t \ge 0$. Let also $m^k = \beta[u^k]$. With $u^k(\cdot) = (u_i^k(\cdot))$ and $m^k(\cdot) = ((\hat\lambda_i^k(\cdot)), (\hat\mu_i^k(\cdot)))$, let 

$$p_i^k(t) = n \int_0^t \hat\lambda_i^k(s)\,ds, \qquad q_i^k(t) = n \int_0^t \hat\mu_i^k(s) u_i^k(s)\,ds, \qquad i = 1, \dots, J,\ t \ge 0.$$

Denoting $\Delta z(s) = z(s) - z(s-)$, let also 

$$\tau_{k+1} = \inf\{t > \tau_k : \text{either } \Delta a_i(p_i^k(t)) > 0 \text{ or } \Delta b_i(q_i^k(t)) > 0 \text{ for some } i = 1, \dots, J\},$$

where $\inf \emptyset = +\infty$. We first consider the case that $\tau_{k+1} < +\infty$. In this case, 

$$\text{there is } i \text{ such that either } \Delta a_i(p_i^k(\tau_{k+1})) > 0 \text{ or } \Delta b_i(q_i^k(\tau_{k+1})) > 0. \tag{62}$$

In the former case we let $v^k = v_i$, otherwise we let $v^k = \tilde v_i$. 

The three processes are defined on the next interval as follows. Let $X(t) = X(\tau_k)$ for $t \in 
(\tau_k, \tau_{k+1})$, and $X(\tau_{k+1}) = X(\tau_k) + n^{-1}\pi(X(\tau_k), v^k)$. Let $u(t) = u^n(X(t))$ for $t \in (\tau_k, \tau_{k+1}]$. Let 
$u(t) = u^n(X(t \wedge \tau_{k+1}))$ and define $\bar m(t) = \beta[u](t)$, $t \in [0, \tau_{k+1}]$. Note that since $\beta$ is a strategy, 
this definition of $\bar m$ is consistent with its definition up to $\tau_k$ since so is the definition of $u$. For the 
same reason, for a.e. $t < \tau_{k+1}$, $\bar m(t) = m^k(t)$. In particular, the equations for $p^k$, $q^k$ still hold if we 
replace hats by bars, namely, 

$$p_i^k(t) = n \int_0^t \bar\lambda_i(s)\,ds, \qquad q_i^k(t) = n \int_0^t \bar\mu_i(s) u_i(s)\,ds, \qquad i = 1, \dots, J,\ \tau_k \le t \le \tau_{k+1}. \tag{63}$$

Note that the above relations are consistent in the sense that for a given $k$, they hold not only 
for $t \in [\tau_k, \tau_{k+1}]$, but in fact for $t \in [0, \tau_{k+1}]$. Hence, on the event $\tau_k \to \infty$, one can equivalently 
consider the processes 

$$p_i(t) = n \int_0^t \bar\lambda_i(s)\,ds, \qquad q_i(t) = n \int_0^t \bar\mu_i(s) u_i(s)\,ds, \qquad i = 1, \dots, J,\ t \ge 0. \tag{64}$$

This completes the definition of the three processes on $[0, \tau_{k+1}]$. 

In case that $\tau_{k+1} = +\infty$, the definitions above of $X$, $u$ and $\bar m$ all apply on $(\tau_k, \tau_{k+1})$ and there 
is nothing else to define. 
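The recursive construction can be mimicked in a small simulation. The following is a minimal sketch under simplifying assumptions that are not part of the proof: arrivals move queue $i$ up by one, services move it down by one and are deleted at an empty queue, the perturbed rates are constants, and the control is a state feedback that is constant between jumps (so each time change is piecewise linear). The function `simulate` and its arguments are hypothetical names.

```python
import random

def simulate(x0, lam, mu, feedback, n, horizon, rng):
    """Drive J queues by 2J unit-rate Poisson clocks run at speeds n*lam_i and n*mu_i*u_i."""
    J = len(x0)
    elapsed = [0.0] * (2 * J)                          # p_i(t), q_i(t): internal clock times
    next_level = [rng.expovariate(1.0) for _ in range(2 * J)]  # next jump of each unit-rate clock
    t, x, jump_times = 0.0, list(x0), []
    while True:
        u = feedback(x)
        rates = [n * lam[i] for i in range(J)] + [n * mu[i] * u[i] for i in range(J)]
        # real time until each clock fires if the rates stay frozen until the next jump
        waits = [
            max(0.0, next_level[i] - elapsed[i]) / rates[i] if rates[i] > 0 else float("inf")
            for i in range(2 * J)
        ]
        dt = min(waits)
        if t + dt > horizon:                           # also covers dt = inf
            break
        i = waits.index(dt)
        t += dt
        for j in range(2 * J):
            elapsed[j] += rates[j] * dt
        elapsed[i] = next_level[i]                     # avoid rounding drift on the firing clock
        next_level[i] += rng.expovariate(1.0)
        if i < J:
            x[i] += 1                                  # arrival to queue i
        elif x[i - J] > 0:
            x[i - J] -= 1                              # service; deleted if the queue is empty
        jump_times.append(t)
    return x, jump_times
```

Here the state is kept in lattice units (the scaled process is `x / n`), and `jump_times` plays the role of the sequence $(\tau_k)$; with bounded rates only finitely many jumps occur on a bounded interval, in line with the non-explosion argument.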

To complete the construction of the three processes on $\Omega \times [0, +\infty)$, we must consider the set $\Omega_0$ 
of $\omega \in \Omega$ for which $T = \sup_k \tau_k$ is finite. We show that this set is $P$-null owing to the fact that the 
range $M^b$ of $\beta$ consists of bounded functions. Suppose $T$ is finite. The construction above defines 
$X$, $u$ and $\bar m$ on $[0, T)$. Let $u'(t) = u(t)$ for $t < T$ and define $u'(t)$ arbitrarily on $[T, +\infty)$ but such 
that $u' \in U$. Then $\bar m' = \beta[u']$ agrees with $\bar m$ a.e. on $[0, T]$. Since each component of $\bar m'$ is bounded 
by $b$, 

$$n^{-1} \max_i\, [p_i(T) \vee q_i(T)] \le 2JTb < +\infty. \tag{65}$$

However, by construction, $T < \infty$ implies that either $a_i(p_i(t)) \to \infty$ or $b_i(q_i(t)) \to \infty$ as $t \uparrow T$, for 
some $i$. Hence the event $\{T < \infty\}$ must be a null set. We let $X$, $u$ and $\bar m$ be defined arbitrarily on $\Omega_0$. 

The definition of the process $Y$ is similar to that of $X$, but where $\pi(x, v)$ is replaced by $v$ 
throughout. The relation $X = \Gamma(Y)$ is clear from the construction. 

Define for each $t \ge 0$ $F_t$ to be the $\sigma$-field generated by $\{Y(s), s \in [0, t]\}$. Note that it is 
equivalently defined as the $\sigma$-field generated by $\{a_i(p_i(t)), b_i(q_i(t)), i = 1, \dots, J\}$, where $p_i, q_i$ are 
as in (64). By construction, $u(t) = u^n(X(t))$, $t \ge 0$ and item (c) holds. Item (b), namely that 
$\bar m = \beta[u]$, also holds by construction. $X$ and $u$ are therefore $(F_t)$-adapted, and since $\beta$ is a strategy, 
so is $\bar m$, and item (a) holds. Items (e) and (f) are trivial. Concerning (g), let $i_k \in \{1, \dots, 2J\}$ denote 
the index $i$ satisfying (62) in case that $\Delta a_i(p_i^k(t)) > 0$ holds, and let it denote $i + J$ in the case 
$\Delta b_i(q_i^k(t)) > 0$. It suffices to show that for every $i \in \{1, \dots, 2J\}$, 

$$P(i_k = i \mid F_{\tau_k}) = \begin{cases} E\big[\int_{\tau_k}^{\tau_{k+1}} p_i(s)\,ds \mid F_{\tau_k}\big]/Z_k, & i \le J, \\ E\big[\int_{\tau_k}^{\tau_{k+1}} q_{i-J}(s)\,ds \mid F_{\tau_k}\big]/Z_k, & i > J, \end{cases}$$

where $Z_k$ is a normalization factor (not depending on $i$). For $k = 0$ ($\tau_0 = 0$), this is a well known 
property of exponential clocks. For $k > 0$, the same argument holds, merely because conditional on 
$F_{\tau_k}$, the processes $\int_{\tau_k}^{\cdot} p_i(s)\,ds$, $\int_{\tau_k}^{\cdot} q_i(s)\,ds$ are independent, and moreover, $a_i(\cdot - \tau_k) - a_i(\tau_k)$, $b_i(\cdot - 
\tau_k) - b_i(\tau_k)$ are still independent Poisson processes (which is a statement on the lack of memory for 
exponential random variables). 
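The property of exponential clocks invoked here is easiest to see when the rates are constant: the clock that rings first is clock $i$ with probability proportional to its rate. The sketch below illustrates only this constant-rate special case (the time-dependent rates of the proof are not modelled), comparing the exact probabilities with a seeded Monte Carlo estimate; the function names are hypothetical.

```python
import random

def first_clock_probabilities(rates):
    """Exact law of the index of the first of several independent exponential clocks."""
    total = sum(rates)
    return [r / total for r in rates]

def empirical_first_clock(rates, trials, rng):
    """Monte Carlo estimate of the same law from independent exponential samples."""
    counts = [0] * len(rates)
    for _ in range(trials):
        times = [rng.expovariate(r) for r in rates]
        counts[times.index(min(times))] += 1
    return [c / trials for c in counts]
```

In this special case the total rate is the analogue of the normalization factor $Z_k$, since it does not depend on which clock rings.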

The proof of the claim regarding the martingale associated with $\mathcal{L}_0$ is similar (only simpler). 
This completes the proof of the first part of the lemma. 

Clearly, 

$$\max_i\, p_i(T_0) \vee q_i(T_0) \le nT_0 b,$$

where $T_0$ is as in Lemma 5. Thus, if $N^n = \max\{k : \tau_k \le T_0\}$, then 

$$N^n \le \sum_i \big[a_i(nT_0 b) + b_i(nT_0 b)\big],$$

and the second part of the lemma follows. ■ 



Proof of Lemma 8. The proof is completely analogous to that of Lemma 7 and is therefore 
omitted. ■ 

Proof of TheoremSJ By Theorems El and |H1 V b ~ = V b,+ for all b G [b*, oo). As a result, Theorem 
El implies that $V^n \to V^{b,-}$ for all $b \in [b^*, \infty)$, as $n \to \infty$. In particular, $V^{b,-}$ does not depend on 
$b \in [b^*, \infty)$. It remains to show that for all $x$, $V^{b,-}(x) \to V^-(x)$ and $V^{b,+}(x) \to V^+(x)$ as $b \to \infty$. 

Proof that $V^{b,-} \to V^-$. It is immediate from the definitions that $V^- \le V^{b,-}$. 

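Let $\beta \in B$, and let $\sigma = \sigma(x, u, \beta)$ be the exit time of $\phi$ from $G$ where $\dot\phi = \pi(\phi, v(u, \beta[u]))$, 
$\phi(0) = x$. Let $\tilde\beta$ be defined by 

$$\tilde\beta[u](t) = \begin{cases} \min\{b, \beta[u](t)\}, & t < \sigma, \\ \hat m, & t \ge \sigma, \end{cases}$$

where $\hat m$ sets all $\bar\mu_j = 0$ and $\bar\lambda_j = 1_{j = i_{\phi(\sigma)}}$, and the minimum is componentwise. It is clear that $\tilde\beta$ 
is a strategy. Let $\tilde u$ be any extension of $u$ to $[0, \infty)$, and denote by $\tilde\phi$ and $\tilde\sigma$ the dynamics and exit 
time corresponding to $x, \tilde\beta, \tilde u$. Recall that, by assumption, $b$ is greater than all $\lambda_i$ and $\mu_i$. Thus 

$$
\begin{aligned}
C(x, \tilde u, \tilde\beta[\tilde u]) &= \int_0^{\sigma \wedge \tilde\sigma} \big(c + \rho(u, \tilde\beta[u])\big)\,ds + 1_{\tilde\sigma > \sigma} \int_\sigma^{\tilde\sigma} \big(c + \rho(u, \hat m)\big)\,ds \\
&\le C(x, u, \beta[u]) + c_1(\tilde\sigma - \sigma)^+.
\end{aligned}
$$

Moreover, by the Lipschitz property of the Skorokhod map, and denoting $\beta[u] = (\bar\lambda_i, \bar\mu_i)$, 

$$(\tilde\sigma - \sigma)^+ \le c_2|\phi(\sigma) - \tilde\phi(\sigma)| \le c_2 \int_0^\sigma \sum_i \big[(\bar\lambda_i - b)^+ + (u_i\bar\mu_i - b)^+\big]\,ds.$$

Since it is enough to consider $\beta$ for which (for any $u$) $\bar\lambda_i$ and $u_i\bar\mu_i$ are uniformly integrable over $[0, \sigma]$, 
we have that $(\tilde\sigma - \sigma)^+ \le \delta(b)$, where $\delta(b) \to 0$ as $b \to \infty$. This shows that $\lim_{b \to \infty} V^{b,-}(x) \le V^-(x)$. 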
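The role of integrability in the last step can be seen on a concrete unbounded rate. As an illustration only, take the hypothetical rate $\bar\lambda(s) = -\log s$ on $(0, 1]$, which is integrable but unbounded; the excess over level $b$ integrates to exactly $e^{-b}$, so the truncation error vanishes as $b \to \infty$. The sketch below approximates the integral of $(\bar\lambda - b)^+$ with a midpoint rule (the step count is an arbitrary choice).

```python
import math

def excess_integral(rate, b, T=1.0, steps=200_000):
    """Midpoint-rule approximation of the integral over (0, T) of (rate(s) - b)^+."""
    h = T / steps
    return h * sum(max(rate((k + 0.5) * h) - b, 0.0) for k in range(steps))

def log_rate(s):
    """An integrable but unbounded rate on (0, 1]."""
    return -math.log(s)
```

For this rate `excess_integral(log_rate, b)` is close to `math.exp(-b)`, a simple instance of a bound $\delta(b) \to 0$.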

Proof that $V^{b,+} \to V^+$. It is immediate that $V^+ \le V^{b,+}$. 

To show that $V^+(x) \ge \lim_{b \to \infty} V^{b,+}(x)$ it is enough to show that for $b > b^*$ and $a$ small, 
$V^+(x) \ge V_{-a}^{b,+}(x)$. For any $m \in M$ let $m^b$ denote the pointwise and componentwise truncation of 
$m$ at level $b$. For any $\alpha \in A$, let $\alpha^b \in A$ be defined by $\alpha^b[m] = \alpha[m^b]$. We will write $m \in M(\alpha, a)$ 
if $m, \alpha, a$ satisfy $C_a(x, \alpha[m], m) \le V_a^+(x) + 1$. In the expression for $V_a^+(x)$, 

$$\sup_\alpha \inf_m C_a(x, \alpha[m], m),$$

it is enough to consider $\alpha \in A$ and $m \in M(\alpha, a)$ (including for $a = 0$). For such $\alpha, m$, the functions 
$\bar\lambda_i, u_i\bar\mu_i$ are uniformly integrable over $[0, T]$. Let $\alpha \in A$ and $m \in M(\alpha^b, 0)$. Consider a truncation 
of $\bar\lambda_i$ and $\bar\mu_i$ at $b$. Denote by $\phi$ [resp., $\phi^b$] the dynamics that correspond to $(x, \alpha^b, m)$ [resp., 
$(x, \alpha^b, m^b)$]. Then the effect of the truncation on $\phi$ is such that for all $a > 0$ there is $b$ such that 
$\sup_{[0,T]} |\phi - \phi^b| \le a$ (by uniform integrability). In particular, $|\phi^b(\sigma) - \phi(\sigma)| \le a$. Hence, using the 
monotonicity of the running cost for large values of the rates, and that $\alpha^b[m^b] = \alpha^b[m]$, 

C„ a (x,a b [m b ],m b ) < C(x, 

We thus have 

C- a (x, a[m], m b ) < C(x,a b [m],m). 
Since m £ M(a b ,0) implies that m b £ M(a b , —a), 

inf C- a (x,a[m],m) < inf C- a (x, a[m b ], m b ) 

m£M b m:m b £M(a b ,-a) 

< inf C(x,a b [m],m) 

meM(a l ,0) 



Hence 



inf_ C(x,a [to], m) 



sup inf C- a (x, a[m], m) < sup inf_ C(x, a[m], to). 



Taking $a \to 0$ by letting $b \to \infty$, we have from Lemma 4 that $\lim_{b \to \infty} V^{b,+}(x) \le V^+(x)$. 